Notebook on Programming: June 2008

Monday, June 30, 2008

An Introduction to Designing Classes in C# - Pt.2

Programming Language: C#
Type: Tutorial
Level: Beginners
Topic Category: General Programming
Main Series: Designing and Implementing Classes in C#
Topic Title: 1) An Introduction to Designing Classes in C# - Pt.2

Quick Review
In part 1 of this article, we discussed classes and how to design them by trying to answer the question of "What is it to declare/define a class?". We established (I hope) what classes are and what they are used for and what they consist of. We also distinguished between the informational and operational features of classes.

Introduction and familiarization
In summary, a class definition consists of defining the following members:

Fields: describe the informational features of the class that must be stored (some informational features are calculated, so they do not need to be stored).
Properties: aliases for the informational features that allow some operations to run before a value is retrieved from or stored in the fields. They are also used for calculated informational features.
Methods: represent the operations the class can do or the operations an object of the class can do.
Events: a way to make objects of the class capable of informing other pieces of code of any event of interest.
Class definitions may also contain definitions of other types, which we call "nested types".

OK. That's fine, now how can I define a class using C#?
The simplest class declaration in C# consists of the keyword "class" followed by the name of the class followed by a pair of curly brackets "{}". For example, for the mail package class:

class MailPackage{
}

However, the complete syntax is:

access-modifier [static|abstract|sealed] class class-name [: [parentClass][, implementedInterface]*]

To fully understand this syntax, let's agree on the syntax description language I will be using to the end:

Italic identifiers: names you need to supply (for instance, the name of the class, its parent, etc.)
Lower case words colored blue: C# keywords
Lower case uncolored words: another coding element that needs further description (for example, the access-modifier)
Things in square brackets: optional items (might be included or not)
Things separated with a pipe (|) character: exactly one of the alternatives, never more than one.
Two things after each other, each in its own square brackets: each is optional on its own, so you may write either, both or neither.
The * symbol after an item: the item may be repeated zero or more times
The + symbol after an item: the item may be repeated one or more times

access-modifier can be one of:

public
internal
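
To make the syntax concrete, here is a small sketch of declarations that fit it. All the names (IPrintable, Shape, Circle, MathHelper, PackageCache) are made up for illustration and are not part of the mail-courier example:

```csharp
using System;

// Hypothetical interface, used only to fill the "implementedInterface" slot.
public interface IPrintable
{
    string Print();
}

// public + abstract: visible everywhere, but cannot be instantiated directly.
public abstract class Shape
{
}

// public + sealed, with a parent class and one implemented interface.
// sealed means no other class can inherit from Circle.
public sealed class Circle : Shape, IPrintable
{
    public string Print()
    {
        return "Circle";
    }
}

// internal + static: visible only inside this assembly, holds only static members.
internal static class MathHelper
{
    public static int Square(int x)
    {
        return x * x;
    }
}

// No access modifier at all: a top-level class is then treated as internal.
class PackageCache
{
}

public static class DeclarationDemo
{
    public static void Main()
    {
        Circle c = new Circle();
        Console.WriteLine(c.Print());             // prints Circle
        Console.WriteLine(MathHelper.Square(4));  // prints 16
    }
}
```

Note that static, abstract and sealed never appear together on the same class here, which matches the pipe (|) in the syntax line.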

OK, one at a time. What are access modifiers and what do they do?
Modifiers are keywords that, when placed before a declaration, change the way the declaration normally behaves. Access modifiers are the modifiers that control the accessibility of a declared item. They are used with both classes and members of classes, and they decide what code can "access" the class or member. By access we mean use in any of the possible ways. Accessing a class means being able to use its name to call methods or to declare and instantiate objects. Accessing a field means being able to write to and read from that field. Accessing a method means being able to call the method and execute it, and accessing a property means being able to read from and (if possible) write to the property. Finally, accessing an event means being able to attach event handlers to that event.
For classes not declared within other classes, there are two levels of accessibility: public and internal. Public classes are accessible to any code anywhere. That means you can use the class within the same assembly, in another assembly, in another country ... whatever and wherever! An internal class can be accessed only within the assembly that contains it. For example, code in one assembly cannot create an object from a class that was declared internal in another, already compiled assembly.
Do I have to specify an access modifier? And how do I decide which one to use?
Well, if you don't supply an access modifier, C# assumes the class is internal, making it inaccessible anywhere outside the assembly that it belongs to. It is good practice, though, to explicitly state the access modifier for your class even if it is internal. The code becomes clearer, and it will be less likely that you'll accidentally hide a class that you want public.
As to how you choose, it depends. Usually, classes that are internal to the implementation of an application are declared internal. The rule of thumb is: declare all classes internal unless you want to access them from outside the assembly or you want other people to be able to. Usually, developers who develop class libraries for other developers to use (or even for themselves to use later on) use both public and internal classes. Application developers, especially when designing a single library unit/application/executable, use all internal classes.
You said access modifiers also apply to members of classes. How is that?
When you declare members, each of them can have an access modifier, and each gets a default access level when you leave the modifier out (private for the members of a class). Most members can have one of five access levels, defined by the following access modifiers:

Declared accessibility	Meaning
public	Access is not restricted.
protected	Access is limited to the containing class or types derived from the containing class.
internal	Access is limited to the current assembly.
protected internal	Access is limited to the current assembly or types derived from the containing class.
private	Access is limited to the containing type.
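
The sketch below shows four of these levels in action. The Account and SavingsAccount classes are hypothetical, and public fields are used only to keep the example short:

```csharp
using System;

public class Account
{
    private decimal balance;              // only code inside Account can touch this
    protected string ownerName = "n/a";   // Account and classes derived from it
    internal int branchCode = 7;          // any code in the same assembly
    public string Id = "ACC-1";           // any code at all

    public void Deposit(decimal amount)
    {
        balance += amount;                // fine: we are inside the containing type
    }

    public decimal GetBalance()
    {
        return balance;
    }
}

public class SavingsAccount : Account
{
    public string Describe()
    {
        // ownerName is protected, so a derived class can read it.
        // Using "balance" here would not compile: it is private to Account.
        return Id + " owned by " + ownerName;
    }
}

public static class AccessDemo
{
    public static void Main()
    {
        SavingsAccount account = new SavingsAccount();
        account.Deposit(50m);
        Console.WriteLine(account.GetBalance());  // prints 50
        Console.WriteLine(account.Describe());    // prints ACC-1 owned by n/a
        Console.WriteLine(account.branchCode);    // prints 7 (we are in the same assembly)
        // account.ownerName = "x";               // would not compile: protected member
    }
}
```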

One thing to note: do not confuse accessibility with security. When we assign accessibility modifiers to classes or members, access is granted to code scopes, not to persons or identities. Also, although most members can have any access level, most code follows a common scheme:

Member Type	Usual Accessibility levels (ordered from most usual)
Field	private, protected, protected internal
Property	public, internal
Method	public, protected, private, protected internal, internal
Event	public

Here's the link to the accessibility modifiers in C# in MSDN:

Access Modifiers (C# Reference)

Wednesday, June 25, 2008

An Introduction to Designing Classes in C# - Pt.1

Most books on C# (and programming in general) start with the infamous "hello world" example. Although this approach is dominant in most training courses and books, I believe it can be misleading. In my opinion, an abstract introduction to the language and the programming framework is necessary before writing any code units that really work. With object oriented programming the situation becomes tricky, especially with a fully object oriented programming language such as C#. In this post, I try to address some concepts that I feel must be introduced abstractly before actually beginning to learn how to write code. The topics I will discuss are mere introductions in most cases, and they demonstrate the concepts without tying them to a specific Programming Language (PL). I put the topics in the form of questions and answers, mainly inspired by the questions students have asked me during the training courses I have given over the years, especially in the early introductory phases.

What is it to declare or define a class?

We mentioned earlier that classes describe types/categories of things. More loosely, a class tells us what a single thing/object of its category will look like.
To do that, a class should list the features of the objects within the class. In programming, the features of an object consist, briefly, of the data that describes the object and the operations that the object can do. So when we define/declare classes (I personally prefer define, however declare is the most commonly used word), what we usually do is write code units that list the features of objects created from that class. That is, when you define a class, you should use the syntax elements your PL offers to describe and list every single feature (informational and operational) of that class.
Let's take a simple example: suppose we are designing software for mail-courier services. The software requires at some point that you represent the mail packages that the company is dealing with. The category of objects/things that we want to represent is MailPackage objects. Congratulations, you found the name of the class you will design! Now the hard part: listing features.
To list the features of a mail package we first need to identify them (that's what we call analyzing the problem to design a solution!). As we mentioned earlier, we divide features into informational/data-related features and operational features, so let's do that!

Informational Features:
First, let's see what informational features a mail package has. At this point, we re-emphasize the importance of the context we are developing within. You need to identify the data features that a mail package has and that are relevant to our case (mail-courier services). In other words, you need to "identify what kind of data a mail-courier service needs to store about each mail package" - simple, isn't it!
Let's say that after interviewing everyone in the company, their wives and their children, you found out that this is what is important to the work of the company:
Size of package: in terms of width, height and depth.
Volume of package: in cubic meters
Weight of package: in kgs rounded to the nearest integer
Order number: a serial number generated for each order that contains letters and numbers
Breakable, OneSide, Sensitive, Hazardous, Sealed: are all flags that can be yes or no
Packaging: one of (PaperBag, PlasticBag, CartonBox, WoodBox, MetalBox, PlasticBox, Safe)
As you can see, some of the features (let's call them properties or attributes) are calculated (Volume) and some are compound (Size). Keep that in mind for later use.
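
As a rough preview of where this analysis leads, the list could be sketched in C# as follows. This is only one possible mapping, not the final design of the series: the Size struct, the choice of double and int, the constructor and the property names are assumptions, and only one of the yes/no flags is shown to keep it short.

```csharp
using System;

// "One of" a fixed list of choices maps naturally to an enum.
public enum Packaging { PaperBag, PlasticBag, CartonBox, WoodBox, MetalBox, PlasticBox, Safe }

// Size is a compound feature: three numbers that belong together (assumed to be in meters).
public struct Size
{
    public double Width;
    public double Height;
    public double Depth;

    public Size(double width, double height, double depth)
    {
        Width = width;
        Height = height;
        Depth = depth;
    }
}

public class MailPackage
{
    // Stored features become fields.
    private Size size;
    private int weightKg;
    private string orderNumber;
    private bool breakable;
    private Packaging packaging;

    public MailPackage(string orderNumber, Size size, int weightKg,
                       Packaging packaging, bool breakable)
    {
        this.orderNumber = orderNumber;
        this.size = size;
        this.weightKg = weightKg;
        this.packaging = packaging;
        this.breakable = breakable;
    }

    public string OrderNumber { get { return orderNumber; } }
    public int WeightKg { get { return weightKg; } }
    public Packaging Packaging { get { return packaging; } }
    public bool Breakable { get { return breakable; } }

    // Calculated feature: nothing is stored, the value is computed on demand.
    public double Volume
    {
        get { return size.Width * size.Height * size.Depth; }
    }
}

public static class PackageDemo
{
    public static void Main()
    {
        MailPackage package = new MailPackage(
            "ORD17A", new Size(2, 1, 0.5), 3, Packaging.WoodBox, true);
        Console.WriteLine(package.OrderNumber);  // prints ORD17A
        Console.WriteLine(package.Volume);       // prints 1
    }
}
```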

Operational Features:
Now let's take a look at what operational features a mail package possesses. When doing so, we are looking for "what can mail packages do?" Again, this should be within the context we are working in. Identifying operations can be tricky, and it depends much on the situation, but usually a good approach is to look for "verbs" related to the object at hand within your analysis papers. For example, you might find the following statement in one of the interview papers: "I should be able to find the order for any package that I know". Actually, the one with the information on the order is the package itself, so it is supposed to be responsible for finding its parent Order object. An operation is born: "FindParentOrder".

Another type of operational feature is the "notification". This enables an object of the class to notify other objects of certain things (status changes, value changes, user interactions, events in general). For example, a statement like this in the interview: "when an order is delivered, the owner should be emailed" makes a great candidate for a notification called OrderDelivered. Again, it depends on the situation, but generally, notifications can be found from a statement that:

Is a rule of the form: "When ----- occurs, ------ should be done"
Is a rule of the form: "If ------ changes/occurs, ------ should change/be done"
Is a rule of the form: "In the case of /event of -------, ------- should be done"

Of course, again, you should tie this to the context of development. Notifications are called "events" in C# and the .NET framework in general.
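
Here is a minimal sketch of how the FindParentOrder operation and the OrderDelivered notification might look in C#. The Order class, the MarkDelivered method and the idea of keeping the parent order in a field are assumptions made for illustration; a real application might look the order up somewhere else.

```csharp
using System;

// Hypothetical, stripped-down Order class.
public class Order
{
    public string Number;

    public Order(string number)
    {
        Number = number;
    }
}

public class MailPackage
{
    private Order parentOrder;

    // Notification: other code subscribes to be told about the delivery.
    public event EventHandler OrderDelivered;

    public MailPackage(Order parentOrder)
    {
        this.parentOrder = parentOrder;
    }

    // Operation: the package is responsible for finding its own order.
    public Order FindParentOrder()
    {
        return parentOrder;
    }

    // Hypothetical method that triggers the notification.
    public void MarkDelivered()
    {
        EventHandler handler = OrderDelivered;
        if (handler != null)   // null means nobody has subscribed
        {
            handler(this, EventArgs.Empty);
        }
    }
}

public static class NotificationDemo
{
    public static void Main()
    {
        MailPackage package = new MailPackage(new Order("ORD-17"));

        // "When an order is delivered, the owner should be emailed."
        package.OrderDelivered += delegate(object sender, EventArgs e)
        {
            MailPackage p = (MailPackage)sender;
            Console.WriteLine("Email owner: order " + p.FindParentOrder().Number + " delivered");
        };

        package.MarkDelivered();  // prints Email owner: order ORD-17 delivered
    }
}
```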

Tuesday, June 24, 2008

Reference Types and Value Types

Programming Language: C#
Type: Tutorial
Level: Beginners
Topic Category: General Programming
Main Series: Introduction to programming in C# and the .NET framework
Topic Title: Reference vs. Value Types

As we mentioned previously, in C#, everything is a class. However, there are five kinds of classes in C#: class, struct, enum, delegate and interface. When I say they are all classes, I mean at the level of the .NET framework: interfaces, for example, are actually abstract classes, and that is how they are implemented at the level of MSIL (although text MSIL has modifiers that mark a class as an interface, a struct, etc., they are all declared as .class declarations).

C# allows three kinds of data types:

Reference Types
Value Types
Pointer Types (Yes C/C++ programmers, it is pointer types alright)

Pointer types are only allowed in unsafe contexts and are thus not managed by the CLR. Therefore we will not discuss them until we do unsafe code.

Value Types are types usually designed to hold single "values", as the name implies. Now, when we say single, we don't mean a single simple-type value. What we mean is that value type objects hold a value rather than a real object. The distinction is unclear, I know, and it always will be, because it is in the eye of the programmer. For example, a point might be thought of as a value, and a size in terms of width, height and depth might be thought of as a compound value. However, not all cases are as clear as the size case. Consider a rectangle type: can we treat a rectangle as a value, or should we treat it as an object? Value types, however, have technical features that we will discuss later. struct and enum types are Value Types. All value types automatically inherit (either directly or indirectly) from System.ValueType.

Reference Types are types usually designed to hold objects that are "active" in some sense. For example, a Vehicle type will be needed for more than just storing the data of a car. Again, it is relative in some cases. class, interface, and delegate types are all reference types.

Now let's suppose we have a type DataType, and that we write the following code:

DataType p = new DataType();

Now, if we write the following line of code immediately after the former:

DataType q = p;

The thing to note about reference types and value types is the way assignment works on each. If DataType is a value type, a copy of p is assigned to q. That is, there will actually be two DataType objects in memory with the same value.
However, if DataType is a reference type, what happens is what we call reference assignment. That is, the reference q is assigned the same object as the reference p, making both p and q point to the same object. This means only one DataType object will be in memory. Now, modifying the object through p won't affect q in the first case (value types). In the second case (reference types), the change is visible through both p and q.
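
The two cases can be tried side by side with a small sketch. PointValue and PointReference are hypothetical stand-ins for DataType, identical except that one is a struct and the other is a class:

```csharp
using System;

public struct PointValue       // value type
{
    public int X;
}

public class PointReference    // reference type
{
    public int X;
}

public static class AssignmentDemo
{
    public static void Main()
    {
        PointValue p1 = new PointValue();
        PointValue q1 = p1;        // copies the value: two separate objects
        p1.X = 5;
        Console.WriteLine(q1.X);   // prints 0, q1 kept its own copy

        PointReference p2 = new PointReference();
        PointReference q2 = p2;    // copies the reference: one shared object
        p2.X = 5;
        Console.WriteLine(q2.X);   // prints 5, q2 sees the change
    }
}
```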

Below is a comparison between reference and value types, characterizing the main features of each kind of type and the differences between them:

Reference Types	Value Types
Holds reference to an object in memory	Holds a value in memory
Assignment of two references will make them point to the same object in memory	Assignment of two value type objects will copy one to the other, keeping two separate objects in memory
Declaring a reference does not automatically create an object for it	Declaring a value type object automatically creates an object in memory
class, interface, delegate	struct and enum
Usually used for active objects, those which will need to perform various operations and interact with other objects	Usually used for creating value holding types (types from which objects will be able to hold values)
Sample candidates: Vehicle, Product, SaleItem, Delivery, Queue, Student, GraphicalShape, GeometricalShape, DataBase, NetworkMessage, ChatControl, BlogController	Sample candidates: Size, Volume, Area, PersonName, PhoneNumber, EmailAddress, MailingAddress, IpAddress, IpConfigurations, BlogSettings
Can be assigned a null value (a reference pointing to nothing - actually null is called nothing in VB.NET)	Can not be assigned a null/nothing value unless made nullable

Sometimes we need to treat value types as reference types. This is done in an operation called boxing and it simply involves casting the value type value into an Object reference:

int i = 10;
object o1 = i;          // Implicit boxing
object o2 = (object)i;  // Explicit boxing
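
A slightly longer sketch shows that boxing copies the value into a new object, and that getting the value back out (unboxing) requires an explicit cast. The variable names are only illustrative:

```csharp
using System;

public static class BoxingDemo
{
    public static void Main()
    {
        int i = 10;
        object boxed = i;          // boxing: the value is copied into a new object
        i = 20;                    // changing i does not touch the boxed copy
        int unboxed = (int)boxed;  // unboxing: an explicit cast copies the value back out

        Console.WriteLine(boxed);    // prints 10
        Console.WriteLine(unboxed);  // prints 10
        Console.WriteLine(i);        // prints 20
    }
}
```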

It is important to understand the difference between reference and value types to fully understand the difference between structs and classes and to know when to use which.

The following built-in types in C# are value types:

short, int, long, ushort, uint, ulong, byte, sbyte, float, double, decimal, bool, char, and every struct and enum type

The default value is zero for the numeric types, false for bool, '\0' for char, the value produced by casting 0 to the enum type for enums, and, for structs, the value produced by setting all value-type fields to their default values and all reference-type fields to null.
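
These defaults can be checked with the default(...) expression. In the sketch below, the Packaging enum and the Label struct are hypothetical types created just for the demonstration:

```csharp
using System;

public enum Packaging { PaperBag, PlasticBag }

public struct Label
{
    public int Copies;    // value-type field
    public string Text;   // reference-type field
}

public static class DefaultsDemo
{
    public static void Main()
    {
        Console.WriteLine(default(int));           // prints 0
        Console.WriteLine(default(bool));          // prints False
        Console.WriteLine(default(char) == '\0');  // prints True
        Console.WriteLine(default(Packaging));     // prints PaperBag (the member whose value is 0)

        Label label = new Label();
        Console.WriteLine(label.Copies);           // prints 0
        Console.WriteLine(label.Text == null);     // prints True
    }
}
```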


Namespaces, Assemblies and Modules

Programming Language: C#, .NET in General
Type: Tutorial
Level: Beginners
Topic Category: General Programming
Main Series: Introduction to programming in C# and the .NET framework
Topic Title: .NET Namespaces, Assemblies and Modules

All .NET PLs share the same code cycle: Code is written, parsed, compiled into a binary MS Intermediate Language (MSIL) Assembly and then executed by the Common Language Runtime (CLR).

Simply put, an assembly is compiled .NET code. The .NET framework has a common intermediate language, as close to assembly language as possible, into which all .NET PLs compile their code. An assembly is a binary MSIL file.
.NET assemblies come in three forms: Windows Executable (.exe), Console Executable (.exe) or Library (.dll).

.NET assemblies contain modules. A module is a piece of compiled code that is not a complete assembly. This technique is primarily for development teams with more than one person working on the same assembly. Each of them can work on a different module of the assembly, and the modules are merged together to form the assembly later (this way, one assembly can be developed using more than one PL: each developer does his/her work in his/her favorite .NET language and compiles the code to a module, which is then merged with the other developers' modules to form the assembly). Now, usually, an assembly will contain only one module, but for accuracy's sake, we introduced the concept of modules.

The C# PL is a fully object-oriented language, which means in colloquial terms "everything is a class". In .NET terms, this means that C# modules can only contain class definitions, nothing more. Because assemblies are the compiled binary MSIL version of C# code files, this means that C# code files must not contain anything other than class definitions, and that is true. However, there are two more constructs that a C# code file can contain at its root level: namespace declarations and using directives. Now, these are not memory-consuming instructions like variable declarations, nor are they code declaration/grouping constructs like methods. Namespaces are logical groupings of classes within a scope. They can be thought of as prefixes to the names of all the classes declared within them. They enable developers to create classes with globally unique names and at the same time help developers organize code into categories of their own choosing.

Let's take a look at a namespace declaration:

namespace GUI{
    namespace DrawingShapes{
        class ShapeBase{}
        class Rectangle : ShapeBase{}
        class Circle : ShapeBase{}
        // ....
    }
    namespace Controls{
        class MyControlBase{}
        class MyButton : MyControlBase{}
    }
    namespace Text{
        class MyTextBox{}
        class MyRichTextBox{}
    }
    namespace Tools{
        class Tool1{}
        class Tool2{}
    }
    namespace InternalImplementation{
    }
}
As you might have noticed, this is a clipping of a graphical user interface elements library. Now imagine two companies/developers working on the same kind of project (a GUI elements library). They both choose to develop a circle class, a rectangle class and so on. However, a developer wants to use both libraries together (maybe the circle of the first company is drawn better than the one from the second company!). If there were no way to distinguish each class from the beginning, it would be impossible for the developer to use both libraries.
However, as we said, a namespace is a scope for the classes defined within it and it becomes a part of their names, so now the full name of the Circle class in this example becomes: GUI.DrawingShapes.Circle
Now, if every company/developer keeps all his/her classes in a parent namespace of his/her choosing, it becomes nearly impossible for a naming collision to occur between his/her classes and other developers' classes. It also solves naming conflicts within the same library. For example, in an engineering design application library, you might want to develop a class that represents the geometrical shape called Rectangle, and at the same time you want to represent a Rectangle drawn on the screen. These are two types sharing the same name.
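
The two-Rectangles situation can be sketched like this. The Engineering, Geometry and Drawing namespaces, the method bodies and the alias names are all hypothetical. The full name always works, and a using alias directive gives each class a short local name:

```csharp
using System;
// Aliases give each Rectangle a short, unambiguous local name.
using GeoRectangle = Engineering.Geometry.Rectangle;
using ScreenRectangle = Engineering.Drawing.Rectangle;

namespace Engineering
{
    namespace Geometry
    {
        // The geometrical shape.
        public class Rectangle
        {
            public double Area(double width, double height)
            {
                return width * height;
            }
        }
    }

    namespace Drawing
    {
        // The rectangle drawn on the screen.
        public class Rectangle
        {
            public string Draw()
            {
                return "[]";
            }
        }
    }
}

public static class NamespaceDemo
{
    public static void Main()
    {
        // The full name needs no using directive at all.
        Engineering.Geometry.Rectangle r1 = new Engineering.Geometry.Rectangle();
        GeoRectangle r2 = new GeoRectangle();
        ScreenRectangle r3 = new ScreenRectangle();

        Console.WriteLine(r1.Area(2, 3));          // prints 6
        Console.WriteLine(r2.GetType().FullName);  // prints Engineering.Geometry.Rectangle
        Console.WriteLine(r3.Draw());              // prints []
    }
}
```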

Assemblies:

Binary compiled code files
Contain Modules that contain classes
When deciding how to divide classes among assemblies, you usually want to physically separate classes into separate files (for example, dictionary classes and drawing classes in a word processing application should be physically separated because they are both optional components of the application, so they may or may not be included in the installation)
You should always separate classes into different assemblies if they:
- Belong to different plug-ins/add-ins of a container application
- Belong to two separate optional features of the application
- Perform totally unrelated tasks
An assembly can contain classes from different namespaces

Namespaces:

Logical scope of classes
Classes from the same namespace can span multiple assemblies
Is used to solve naming conflicts in classes
Adds a prefix to the class name
Is used to group similar classes (usually by task)

Introduction to classes in C# and the .NET framework

What are classes and types?

Classes are code units that describe the attributes, of interest to the software under development, of some real-life object/thing. A class describes the data and data structures needed to define an instance of the class/type/kind (an instance being some arbitrary single object of the kind described by the class), and the operations that objects of that kind can perform.

In other words, the class is a digital representation of a "kind of objects/things" or a "category of things". It’s not a digital representation of the "things" themselves. A digital representation of one of the things is an object/instance of that class!

For example, let’s consider software that needs to represent a car electronically. Of course, a car represented digitally will not be an exact copy of the real-life car, nor will it include all its features. Instead, we see what actually is of concern to our software and represent it. That is, we usually represent real-life things in a certain "context" rather than representing them generally and thoroughly. The cars we want to represent are needed in the context of software that car resellers can use, so the information of use will not include, for example, the exact materials of which the car parts were manufactured, the list of brands of each component of the car, or the thorough dimensions or technical drawing of the car’s body.

Instead, let’s suppose that for the client of the software you are designing, what matters is the color, make, model, tire model, number of passengers, manufacturer and price of the car.
As we said earlier, classes represent kinds/categories, not single instances. So, we need to represent the type/kind/category of things called cars. A class will describe what kind of values are needed to represent the attributes of a car, but it will not give specific values for these attributes. Actually, the process of designing classes is simply the process of identifying the attributes of the kind/type/category of things the class will represent!

So if we want to represent a car in the "context" we extracted from the customer, we will need to represent its color, its make, its model ... etc. Words such as color, make and model are the names of the attributes/fields needed to define a single car. By "the attributes needed to define a car" we mean that to define a specific single car, we must give a value for each of these attributes. We will also need to define the type/kind of each attribute so as to be able to give it a value. For instance, a color can be represented by a number, by a name from a list of available color names, or by three numbers, each representing the amount of red, green and blue that must be mixed to get the color. We choose the representation that best fits our need.

You can think of classes as the means by which we describe types of things to the computer. For example, if we want to describe to the computer what points (as a kind of geometrical shape) are, we tell it that for a point to be defined, it needs an x coordinate and a y coordinate, both being numbers (integers if we work in an integer-only environment).
Now when we say that we want to define/create a specific point, we have to give the computer its specific x and y coordinate values.
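
In C#, the difference between the description (the class) and the specific things (the objects) might be sketched as follows. Only some of the car attributes are shown, the types chosen for them are assumptions, and public fields are used only to keep the sketch short:

```csharp
using System;

// The class says WHAT describes a car; it gives no specific values.
public class Car
{
    public string Color;
    public string Make;
    public string Model;
    public int NumberOfPassengers;
    public decimal Price;
}

// A point needs an x coordinate and a y coordinate, both numbers.
public class Point
{
    public int X;
    public int Y;
}

public static class ClassVersusObjectDemo
{
    public static void Main()
    {
        // Each object supplies its own specific values for the attributes.
        Car first = new Car();
        first.Color = "Red";
        first.Model = "Roadster";
        first.NumberOfPassengers = 2;

        Car second = new Car();
        second.Color = "Blue";
        second.Model = "Van";
        second.NumberOfPassengers = 7;

        Point origin = new Point();   // X and Y start at 0
        Point corner = new Point();
        corner.X = 3;
        corner.Y = 4;

        Console.WriteLine(first.Color + " " + first.Model);    // prints Red Roadster
        Console.WriteLine(second.Color + " " + second.Model);  // prints Blue Van
        Console.WriteLine(origin.X + "," + origin.Y);          // prints 0,0
        Console.WriteLine(corner.X + "," + corner.Y);          // prints 3,4
    }
}
```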

The primitive/built-in/fundamental classes in C# and .NET

The C# programming language provides a set of what we call primitive/essential/fundamental/basic/simple types so as to use them in describing other types. Each of these types is a class like any other class you develop but it is built-in to the .NET framework and some of their names are reserved words in the C# language.

For example, there is a set of types that represents the different kinds of numerals in the real world (the short, int, long, ushort, uint, ulong, float, double and decimal types/classes).

The integer class itself (actually named Int32 in the .NET framework) describes integer numbers, and an integer instance represents a single integer value. We declare an integer variable by writing:

int someInteger;

Then we assign a value to it (an integer field is given the value 0 initially, but a local variable must be assigned before it is read):

someInteger = 10;
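
A short sketch can confirm that the int keyword and the Int32 class of the .NET framework are the same type. The variable sameType is added only for the demonstration:

```csharp
using System;

public static class AliasDemo
{
    public static void Main()
    {
        int someInteger;
        someInteger = 10;
        System.Int32 sameType = someInteger;  // int is the C# keyword for System.Int32

        Console.WriteLine(typeof(int) == typeof(System.Int32));  // prints True
        Console.WriteLine(someInteger.GetType().FullName);       // prints System.Int32
        Console.WriteLine(sameType + 1);                         // prints 11
    }
}
```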

Some of the classes that are considered simple/basic/primitive are:

short, int, long, ushort, uint, ulong, float, double, decimal, string, object, byte, void

Some of the more complex but still considered basic by many:

DateTime, TimeSpan, Version, Guid, DayOfWeek

Monday, June 23, 2008

Yet another notebook on programming ...

One thing every developer/programmer knows for sure is: you never know it all! In fact, we can safely say: you never know enough about programming!
Here's another fact about programming: problems never cease to exist! They just keep appearing and popping up.
I've been working as a trainer for MCSD .NET for a while now (about 5 years) and I tried in so many ways to share my experience with others, and I FAILED. Not that I could not find content to share. On the contrary, there was so much that I never found the time to put it all together in one place. Being a programmer makes you a little verbose and a little perfectionist, so every time I tried to put up some material, I started categorizing and reasoning, then starting from scratch. So for example, if I wanted to write about printing documents using C#, I had to write about the classes you use, and then I found myself writing about classes in general and then about what classes do, and so I started from scratch all over again.
When I tried to write a series of articles on my blog, it took me five years and it is not finished yet!
I found out that the problem was that the order of topics and reasoning was an obstacle on the old blog. You know, it's ironic that the world of the internet is going from hierarchical services to cloud structured services (for example, email, bookmarks, topics, etc. used to be categorized in a hierarchical folder structure back in the old days, then came the "TAG" magic and it all changed).
I decided to open up a new blog, fresh start, clean slate, where I can post topics as they come to my mind, no matter if they are related to each other or not, and no matter if they are tutorials, ideas, code samples, questions, suggestions, links and websites or any other form of thought as long as it is related to programming.

Now, I've used around 17 programming languages during my life (not counting the different flavors/versions of a language), but currently I'm fairly settled on C# for desktop development (C++ and its flavors when C# is not an option), C# and PHP for web development, SQL Server 2005 (2000 if 2005 is not an option) and MySQL for databases, and Prolog for knowledge intensive AI applications. I love both open source and proprietary software (I appreciate unix and love MS).