Wednesday, January 31, 2007

My first MsBuild task.

I frequently use MsBuild in my build scripts, but hadn't had a chance to actually write a task myself.

One particular task I face is publishing the files in a directory to an FTP server. So I googled and tried to find a solution. The first hit was http://msbuildtasks.tigris.org/, which is a great community project. It includes a lot of useful MsBuild tasks, including an FTP task. But what I wanted to FTP is a whole directory, not just a single file. So I started to write a task called FtpDirectoy, which loops through all the qualified files and FTPs them onto the server.

The task is not difficult to write: basically, you derive a new class from Microsoft.Build.Utilities.Task, set up a bunch of properties, and finally override public override bool Execute(), which does the actual work.

The deployment is quite tricky. I copied the assembly to C:\Program Files\MSBuild\MSBuildCommunityTasks and created a targets file, "CrystalGis.MsBuild.Tasks.Targets", which is then imported by the other project files.

Here is where the frustration came in: no matter what I did, the build always told me it couldn't find the task. Finally, I found an answer at http://forums.microsoft.com/MSDN/ShowPost.aspx?PostID=122927&SiteID=1

[Keith Hill:Try adding the full path to the project file to the SafeImports reg key in the registry. The location of the regkey is:
HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\VisualStudio\8.0\MSBuild\SafeImports
I saw this problem just the other day and this fixed it. A better way to do this is to put your <UsingTask ...> in some <pick_a_name>.targets file and drop this targets file into a location that you can <Import ...> into your project file. Then put the full path to the targets file into the SafeImports section of the registry. This way, you only have to add the targets file once. The other way you would have to add each project file you use it in to SafeImports, well at least the ones you load into VS 2005. ]

After I added that entry, everything worked.
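For reference, the targets file described above might look something like this. This is a sketch, not the actual file: the assembly path, the item names, and the Deploy target are my assumptions.

```xml
<Project xmlns="http://schemas.microsoft.com/developer/msbuild/2003">
  <!-- register the custom task; the assembly path is an assumption -->
  <UsingTask TaskName="CrystalGis.MsBuild.Tasks.FtpDirectoy"
             AssemblyFile="C:\Program Files\MSBuild\MSBuildCommunityTasks\CrystalGis.MsBuild.Tasks.dll" />
</Project>
```

A consuming project file would then import the targets file and invoke the task:

```xml
<Project xmlns="http://schemas.microsoft.com/developer/msbuild/2003">
  <Import Project="C:\Program Files\MSBuild\MSBuildCommunityTasks\CrystalGis.MsBuild.Tasks.Targets" />
  <ItemGroup>
    <DeployFiles Include="bin\Release\**\*.*" />
  </ItemGroup>
  <Target Name="Deploy">
    <FtpDirectoy Files="@(DeployFiles)"
                 FtpAddrerss="ftp://example.com/site"
                 Username="user" Password="pass"
                 UsePassive="true"
                 WorkingDirectory="bin\Release" />
  </Target>
</Project>
```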

public class FtpDirectoy:Task
{
#region Constructor

public FtpDirectoy()
{

}

#endregion


#region Input Parameters

private string _ftpAddrerss;

/// <summary>
/// Gets or sets the FTP address.
/// </summary>
/// <value>The FTP address.</value>
public string FtpAddrerss
{
get { return _ftpAddrerss; }
set { _ftpAddrerss = value; }
}



private string _username;

/// <summary>
/// Gets or sets the username.
/// </summary>
/// <value>The username.</value>
public string Username
{
get { return _username; }
set { _username = value; }
}

private string _password;

/// <summary>
/// Gets or sets the password.
/// </summary>
/// <value>The password.</value>
public string Password
{
get { return _password; }
set { _password = value; }
}

private bool _usePassive;

/// <summary>
/// Gets or sets the behavior of a client application's data transfer process.
/// </summary>
/// <value><c>true</c> if [use passive]; otherwise, <c>false</c>.</value>
public bool UsePassive
{
get { return _usePassive; }
set { _usePassive = value; }
}


private ITaskItem[] _files;

/// <summary>
/// Gets or sets the files to upload.
/// </summary>
/// <value>The files to upload.</value>
[Required]
public ITaskItem[] Files
{
get { return _files; }
set { _files = value; }
}

private string _workingDirectory;

/// <summary>
/// Gets or sets the working directory for the upload.
/// </summary>
/// <value>The working directory.</value>
/// <remarks>
/// The working directory is the base of the upload.
/// All files will be made relative to the working directory.
/// </remarks>
public string WorkingDirectory
{
get { return _workingDirectory; }
set { _workingDirectory = value; }
}

#endregion


#region Task Overrides

public override bool Execute()
{
Log.LogMessage(CrystalGis.MsBuild.Tasks.Properties.Resources.FtpDirectory, _workingDirectory);
foreach (ITaskItem fileItem in _files)
{
string name = fileItem.ItemSpec;
if (!File.Exists(name))
{
continue;
}
//Ftp file here.......
string remoteUrl=GetUriFromRemoteName(name);
if (string.IsNullOrEmpty(remoteUrl))
continue;
FtpFile(name, remoteUrl );
}
return true;
}

#endregion Task Overrides

#region Private

/// <summary>
/// Gets the remote URI for a local file.
/// </summary>
/// <param name="fileNameWithPath">The file name with path.</param>
/// <returns>The remote URI, or an empty string if the file is outside the working directory.</returns>
private string GetUriFromRemoteName(string fileNameWithPath)
{
string pathWithoutWorkingDirectory;
Log.LogMessage(Properties.Resources.GetUriForFile, fileNameWithPath);
int index = fileNameWithPath.IndexOf(_workingDirectory);
if (index > -1)
{
pathWithoutWorkingDirectory = fileNameWithPath.Substring(index + _workingDirectory.Length + 1);
return _ftpAddrerss + "/" + pathWithoutWorkingDirectory.Replace("\\", "/");
}
else
{
Log.LogError(Properties.Resources.FtpFileInvalid, fileNameWithPath);
return string.Empty;
}
}

/// <summary>
/// FTPs the file.
/// </summary>
/// <param name="localFile">The local file.</param>
/// <param name="remoteUri">The remote URI.</param>
private bool FtpFile(string localFile, string remoteUri)
{
Log.LogMessage(Properties.Resources.FtpUploading, localFile, remoteUri);
Uri ftpUri;
if (!Uri.TryCreate(remoteUri, UriKind.Absolute, out ftpUri))
{
Log.LogError(Properties.Resources.FtpUriInvalid, remoteUri);
return false;
}

FtpWebRequest request = (FtpWebRequest)WebRequest.Create(ftpUri);
request.Method = WebRequestMethods.Ftp.UploadFile;
request.UseBinary = true;
// use the file size, not the length of the path string
request.ContentLength = new FileInfo(localFile).Length;
request.UsePassive = _usePassive;
if (!string.IsNullOrEmpty(_username))
request.Credentials = new NetworkCredential(_username, _password);

const int bufferLength = 2048;
byte[] buffer = new byte[bufferLength];
int readBytes = 0;
FileInfo localFileInfo = new FileInfo(localFile);
long totalBytes = localFileInfo.Length;
long progressUpdated = 0;
long wroteBytes = 0;
try
{
Stopwatch watch = Stopwatch.StartNew();
using (Stream fileStream = localFileInfo.OpenRead(),
requestStream = request.GetRequestStream())
{
do
{
readBytes = fileStream.Read(buffer, 0, bufferLength);
requestStream.Write(buffer, 0, readBytes);
wroteBytes += readBytes;

// log progress every 5 seconds
if (watch.ElapsedMilliseconds - progressUpdated > 5000)
{
progressUpdated = watch.ElapsedMilliseconds;
Log.LogMessage(MessageImportance.Low,
Properties.Resources.FtpPercentComplete,
wroteBytes * 100 / totalBytes,
FormatBytesPerSecond(wroteBytes, watch.Elapsed.TotalSeconds, 1));
}
}
while (readBytes != 0);
}
watch.Stop();

using (FtpWebResponse response = (FtpWebResponse)request.GetResponse())
{
Log.LogMessage(MessageImportance.Low, Properties.Resources.FtpUploadComplete, response.StatusDescription);
Log.LogMessage(Properties.Resources.FtpTransfered,
FormatByte(totalBytes, 1),
FormatBytesPerSecond(totalBytes, watch.Elapsed.TotalSeconds, 1),
watch.Elapsed.ToString());
response.Close();
}
}
catch (Exception ex)
{
Log.LogErrorFromException(ex);
return false;
}

return true;
}



#endregion
}

Monday, January 29, 2007

I finally decided to move on to the new position.

The five and a half years of experience at DDTI are invaluable. I had a lot of passion devoted to my work in my first three years. I would stay overnight to fix a memory leak. I would try to improve a product even when I wasn't assigned to do it.

I liked the people I worked with, and apparently, they liked me too.

Yes, I was frustrated by some minor issues. I am not sure why I was not paid at fair market value. I don't expect to be paid more than others, because I don't think I am smarter than others. I tried NOT to take any of those attitudes into my work, because I understand what is considered professional.

The surprising thing is that they did make a counter offer to try to get me to stay. I am really surprised at how much I am actually appreciated in the final days of my work. They wanted to increase my salary by more than all my raises over the five and a half years combined. I just don't feel comfortable taking this counter offer.

I think I am a loser in this process too. I have this value to the company, but could only get them to pay me that value when I decided to leave.

I decided to move on, and I hope I can restore my passion for software development.

Sunday, January 21, 2007

Merged two blogs together.

I was able to convert the old Blogger account to the new Blogger, and I merged the two accounts together. I didn't find an easy way to merge two blogs, so I simply copied and pasted the posts.

Those posts were not written today; they were just pasted today.

Checked in a test project on Google Code.

Using some kind of source control is not only for backing up the source code; it also shows you the history of the project's evolution.

I had heard of Google Code before, but never tried it. It uses Subversion for source control, which I had used to check out a couple of open source projects.

I have never hosted my own project before, so I wanted to give it a try.

The tool I am using is TortoiseSVN, which also has a sibling called TortoiseCVS. Both are very handy tools.

Here are the steps I followed:

1. Create a project on Google Code.
2. Right click the project folder and select "Import" to check the source code into the Google Code repository. It will prompt for the URL; use http://**.**.**/trunk.
3. After the code was imported to the server, I tried to make some changes, but it gave me an error (MKCOL 405) when I tried to commit them.
4. So I deleted the original project and checked it out from SVN. That created a hidden folder (.svn) under the project folder; I believe it holds all the information TortoiseSVN needs to communicate with the server.
5. Now I can make changes, update the server, and compare the changes.


Start to review some college stuff.

For some reason, I found that in day-to-day work I focused too much on RAD (Rapid Application Development), and I lost some of the ability to write simple data structure code.

I guess many developers nowadays rely too much on libraries, without paying enough attention to how those libraries were actually developed.

I think I need some time to review how to write some basic data structures and their operations:
1. linked list.
2. stack.
3. queue.
4. tree.
5. graph.
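As a warm-up on the first item, here is a minimal singly linked list sketch in C#; the class and member names are mine, purely for review purposes.

```csharp
using System;

// A minimal singly linked list: push at the head, walk the chain to search.
class ListNode<T>
{
    public T Value;
    public ListNode<T> Next;
    public ListNode(T value) { Value = value; }
}

class SinglyLinkedList<T>
{
    private ListNode<T> _head;
    private int _count;

    public int Count { get { return _count; } }

    public void AddFirst(T value)
    {
        ListNode<T> node = new ListNode<T>(value);
        node.Next = _head;   // new node points at the old head
        _head = node;
        _count++;
    }

    public bool Contains(T value)
    {
        // linear scan from the head; O(n)
        for (ListNode<T> n = _head; n != null; n = n.Next)
            if (Equals(n.Value, value))
                return true;
        return false;
    }
}
```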

What are managed resources?

There have been enough posts about IDisposable in .NET. The basic idea: to deterministically free up a resource, you implement the IDisposable interface, and it's better to offer a finalizer too.

Something like this:

class MyResource : IDisposable
{
    ~MyResource()
    {
        Dispose(false);
    }

    public void Dispose()
    {
        Dispose(true);
        GC.SuppressFinalize(this);
    }

    protected virtual void Dispose(bool disposing)
    {
        if (disposing)
        {
            // clean up managed resources.
        }
        // clean up unmanaged resources.
    }
}

So, my question is: what exactly are unmanaged resources?

Are GDI+ objects considered unmanaged resources? How about sockets and other Windows resources?

David Kline has a good example here :


using System;
using System.Diagnostics;
using System.Runtime.InteropServices;

namespace Snippet
{
/// <summary>
/// Wrapper for the Win32 Device Context (DC)
/// </summary>
public class DeviceContext : IDisposable
{
// the Win32 Device Context (DC) object
private IntPtr hDC;

// window handle associated with the DC
private IntPtr hWnd;

/// <summary>
/// Constructor.
/// </summary>
/// <param name="hwnd">
/// The window handle for which to retrieve the device context
/// </param>
public DeviceContext(IntPtr hwnd)
{
// call the p/invoke to get the device context for the specified window handle
IntPtr hdc = GetDC(hwnd);

// verify that the GetDC call succeeded
if(hdc == IntPtr.Zero)
{
throw new Exception("Failed to get the DeviceContext for the specified window handle");
}

// store the window handle and device context for future reference
this.hWnd = hwnd;
this.hDC = hdc;
}

/// <summary>
/// Finalizer
/// </summary>
~DeviceContext()
{
// dispose the object (unmanaged resources)
this.Dispose(false);
}

/// <summary>
/// Cleanup the object (implementation of IDisposable.Dispose)
/// </summary>
public void Dispose()
{
// clean up our resources (managed and unmanaged resources)
this.Dispose(true);

// suppress finalization
// the finalizer also calls our cleanup code
// cleanup need only occur once
GC.SuppressFinalize(this);
}

/// <summary>
/// Cleanup resources used by the object
/// </summary>
/// <param name="disposing">
/// Are we fully disposing the object?
/// True will release all managed resources; unmanaged resources are always released
/// </param>
protected virtual void Dispose(Boolean disposing)
{
if(disposing)
{
//*** release any managed resources
}

// release unmanaged resources
if(this.hDC == IntPtr.Zero)
{
// we've already been disposed, nothing left to do
return;
}
Int32 ret = ReleaseDC(this.hWnd, this.hDC);
this.hDC = IntPtr.Zero;
this.hWnd = IntPtr.Zero;

// assert if the DC was not released.
Debug.Assert(ret != 0,
"Failed to release DeviceContext.");
}

// p/invoke definitions for the Win32 functions used above
[DllImport("user32.dll")]
private static extern IntPtr GetDC(IntPtr hWnd);

[DllImport("user32.dll")]
private static extern Int32 ReleaseDC(IntPtr hWnd, IntPtr hDC);
}
}

Based on my experience, most unmanaged resources have already been wrapped inside .NET classes, and those really should be treated like managed resources. For example, the objects defined in System.Drawing, like bitmaps and fonts, are managed objects containing references to unmanaged resources.

I have a Windows service that depends on a SQL database. Sometimes, when the machine starts and my program tries to connect to the database, the database is not fully initialized yet, so the startup procedure throws an error and the service won't start.

I tried a couple of solutions:

1> The first was to set the recovery options in the service properties to restart the service if it fails. I had actually misunderstood this property until I got a reply from an MSDN newsgroup:

>
>> Recover options are only relevant if your service executable crashes, not
>> because of any HRESULTs which might lead to ending a process. If a
>> process is just ending by leaving all thread functions that is pretty OK
>> for the service control manager.
>>
>> You could try something like int *p = NULL; *p = 42; that should crash
>> your service and the restart should occur after the configured time
>> interval. Defaults to 1 minute IIRC.
2> Another solution I tried was to create a database-watching thread. Instead of aborting the startup procedure, the watching thread keeps checking the database connection and notifies the main thread when the database is available.

This approach works, but it's a bit of an overkill. A thread should be avoided unless it's absolutely needed, and in my case, if the database is not connected, my program can't do anything anyway, since I need information from the database to perform a lot of things.

The final solution I came up with puts the database check in a loop, which retries the connection a configurable number of times (10 by default) at startup, every 10 seconds.

This solution has 2 advantages:
1> The startup procedure of my service won't abort immediately if the database is not available. In my case, most of the time the database is available; it's just a little slow to come up.
2> If it still fails after a certain number of attempts, it probably needs some attention anyway.
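A minimal sketch of that retry loop, with the connection attempt abstracted as a delegate so the policy itself is easy to test; the names and defaults are mine, not the service's actual code.

```csharp
using System;
using System.Threading;

// Retry a startup dependency check a bounded number of times before giving up.
static class StartupRetry
{
    public static bool WaitFor(Func<bool> tryConnect, int maxAttempts, TimeSpan delay)
    {
        for (int attempt = 1; attempt <= maxAttempts; attempt++)
        {
            if (tryConnect())
                return true;        // the dependency is up; startup can continue
            Thread.Sleep(delay);    // not ready yet; wait and try again
        }
        return false;               // still down after maxAttempts; needs attention
    }
}
```

In the service's OnStart, tryConnect would open a SqlConnection (swallowing the connection exception) and return whether it succeeded, with something like maxAttempts = 10 and a 10-second delay.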

What makes a good programmer.

Having worked as a computer programmer for the past 5 years, I have some thoughts about what makes a good programmer.

1> Design patterns

I think it's by far the most important one. The life of the software depends on how easily you can maintain it, add a new feature, or modify an existing one. Every programmer can write some code, but the difference is how they organize it. The seasoned programmer always bears in mind how s/he will maintain the code when s/he writes it.

2> Language features

Familiarity with the language, including the libraries associated with it. There are a lot of languages available (C#, C++, Java...), and nobody can be deeply familiar with all of them, just as nobody can speak every language in the world. But a good programmer has to be very familiar with the one or two languages s/he uses in day-to-day work. Efficiency also comes from this: if you know the language features well, it's easier to write clean code.

I often read code I wrote when I started working 5 years ago, and it feels ugly and not very sophisticated. Sometimes an experienced programmer can write in two lines what an inexperienced programmer needs more than 10 lines to do.

3> Algorithms

That's the part I miss the most. In day-to-day work, the focus is on RAD (Rapid Application Development). Most of the time you don't have to think much about the underlying algorithm; you just find some data structure in the library, and rarely do you have to write something really serious. This kind of work can make smart people dumb, because they don't get the chance to think deeply.

4> Resource Handling

A computer is a shared resource; the computer your program runs on also runs a lot of other programs. If you want to use certain resources, you are borrowing them from the computer, and after using them you should give them back. If you don't, other programs won't be able to use them.

The situation has improved steadily with new technologies and ongoing research. If you work in a managed world like C# or Java, you have less to worry about, since the GC does a lot of the work for you. But there are still a lot of resources the GC won't handle automatically; in other words, you cannot rely on the GC to take care of them.

5> Tools

Making good use of tools. I know some geeks like to program in Notepad and do everything from a command window. I admire those people, since it reflects that they really understand what's under the hood, and a lot of the time they can do things more efficiently.

For me, like most other people, I like powerful IDEs like VS.NET and Eclipse, even though I don't have much experience with the latter. Jon Skeet has a very good post about those two IDEs. I always try to rely on shortcuts instead of the mouse while developing; I feel it improves efficiency a lot.

By tools, I also mean the SDK tools and a lot of third-party add-ins. James Avery has an MSDN article about ten essential VS.NET tools. By making full use of those tools, you can really stand out from co-workers who don't use them.

GMap asp.net control , geocoding, and asynchronous page.

I have been playing with the GMap ASP.NET control recently, and I love it. I didn't have much experience with ASP.NET server controls, and this gave me a good start into some of the details.

I wanted to use the geocoder functionality of GMap, and did some simple research. I first tried to use callback functions. In the class "GServerGeoCoder" I created, I have a Match function which sends an HTTP request to the server, like this:

public void Match()
{
    // kick off the asynchronous geocoder request
    geoCoderHttpWebRequest.BeginGetResponse(new AsyncCallback(RespCallback), geoCoderRequestState);
}

And in RespCallback, I would raise an event, GeoCoderResponseReceived, which the aspx page registered for. I thought it was a good approach, but the problem is that the server control won't render itself in the callback.

The thread that processed the geocoder request goes ahead and finishes the rendering; it won't wait for the callback result to finish the rendering process. And the callback to the aspx page won't cause the server control to render itself again. I assume there is a way to do it, but I haven't found it so far.

Then I searched for other ways and found that ASP.NET actually has a feature called "Asynchronous Pages", which is exactly what I was looking for. By registering a PageAsyncTask, [quoting Jeff Prosise] "The page undergoes its normal processing lifecycle until shortly after the PreRender event fires. Then ASP.NET calls the Begin method that you registered using AddOnPreRenderCompleteAsync. Furthermore, the Begin method returns an IAsyncResult that lets ASP.NET determine when the asynchronous operation has completed, at which point ASP.NET extracts a thread from the thread pool and calls your End method. After End returns, ASP.NET executes the remaining portion of the page's lifecycle, which includes the rendering phase." [end quote]

The whole idea is that we need to postpone the rendering process until we get the result back.
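Here is a rough sketch of what that looks like in a code-behind, assuming the geocoder call can be expressed as an HttpWebRequest. All the names and the URL are hypothetical, and the page directive also needs Async="true".

```csharp
using System;
using System.Net;
using System.Web.UI;

// Hypothetical code-behind: defer rendering until the geocoder response is back.
public partial class GeoCodePage : Page
{
    private HttpWebRequest _request;

    protected void Page_Load(object sender, EventArgs e)
    {
        // register the Begin/End pair; ASP.NET calls Begin shortly after PreRender
        AddOnPreRenderCompleteAsync(
            new BeginEventHandler(BeginGeoCode),
            new EndEventHandler(EndGeoCode));
    }

    private IAsyncResult BeginGeoCode(object sender, EventArgs e, AsyncCallback cb, object state)
    {
        // the geocoder URL is a placeholder
        _request = (HttpWebRequest)WebRequest.Create("http://example.com/geocode?q=...");
        return _request.BeginGetResponse(cb, state);
    }

    private void EndGeoCode(IAsyncResult ar)
    {
        using (WebResponse response = _request.EndGetResponse(ar))
        {
            // parse the geocoder result and update the GMap control here;
            // the rendering phase runs after this method returns
        }
    }
}
```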

The magic yield return, tree enumerators.

The yield return is a magic feature introduced in C# 2.0; it makes it a lot easier to enumerate a collection. I was working on a tree collection recently, which involved doing a BFS and a DFS.

Here is the sample code:

/// <summary>
/// Gets the depth first node enumerator.
/// </summary>
/// <value>The depth first node enumerator.</value>
public IEnumerable<DTreeNode<T>> DepthFirstNodeEnumerator
{
    get
    {
        yield return this;
        if (m_Nodes != null)
        {
            foreach (DTreeNode<T> child in m_Nodes)
            {
                IEnumerator<DTreeNode<T>> childEnum = child.DepthFirstNodeEnumerator.GetEnumerator();
                while (childEnum.MoveNext())
                    yield return childEnum.Current;
            }
        }
    }
}

/// <summary>
/// Gets the breadth first node enumerator.
/// </summary>
/// <value>The breadth first node enumerator.</value>
public IEnumerable<DTreeNode<T>> BreadthFirstNodeEnumerator
{
    get
    {
        Queue<DTreeNode<T>> todo = new Queue<DTreeNode<T>>();
        todo.Enqueue(this);
        while (0 < todo.Count)
        {
            DTreeNode<T> node = todo.Dequeue();
            if (node.m_Nodes != null)
            {
                foreach (DTreeNode<T> child in node.m_Nodes)
                    todo.Enqueue(child);
            }
            yield return node;
        }
    }
}
The point of iterators is to allow the easy implementation of enumerators. Where a method needs to return either an enumerator or an enumerable class for an ordered list of items, it is written so as to return each item in its correct order using the ‘yield’ statement.

The magic yield keeps state information in the enumeration, so it always remembers where to resume on the next iteration.

The trip.

I have been on vacation since last week, and traveled to Phoenix, Las Vegas, San Francisco, Los Angeles, and San Diego. It was a great trip, partly because I really hadn't been to those parts of the country much.

The best city in my view is San Francisco, including its suburbs. The weather is so nice, compared with cold Columbus. I was told the summer is also very nice.

I also got to meet a lot of friends I haven't seen for years. They are doing very well, and I am really happy for them.

Now, I am back in Phoenix, and will spend most of the time in hotel and have some time to read some books.

I had a new friend on this trip: Microsoft Streets and Trips with the GPS receiver. I am very happy with it. The problem with traditional maps is that sometimes it's hard to locate where you are, but with the GPS you are always aware of your current location. That makes a big difference. And the map is pretty accurate. The only time I had an issue was driving among the skyscrapers in SF, where the GPS signal seemed to be blocked a little by the buildings.

Implement a queue using an array...

I have been reviewing some basic data structures lately. The queue is one of them; as with the other data structures, you can implement it with an array or a linked list.

When using an array to implement the queue, there are two indices, one pointing to the head and the other to the tail. With an array of size N, we have 0 <= head < N and 0 <= tail < N, so the difference (tail - head) mod N takes only N distinct values. But the queue length can be anything from 0 to N, which is N+1 distinct values, so the number of states the queue can be in is larger than what the two indices alone can encode.

In particular, when the queue is empty, the head and the tail point to the same position, and when it is full, they also point to the same position. So we cannot tell whether the queue is empty or full based solely on the head and tail positions.

For example, suppose that when the queue is just created, the head and the tail both point to position 0. If we keep adding N elements to an array of size N, the tail wraps all the way around and points to position 0 again, so the head and tail are in exactly the same positions as in the empty queue.

There are two options for dealing with this problem. The first is to limit the number of elements in the queue to at most N-1. The other is to use an extra member variable, count, to keep track explicitly of the actual number of elements rather than inferring it from the head and tail. Both are valid options.

In the first option, we sacrifice one array slot: the queue is empty when head == tail, and full when (tail + 1) mod N == head, so the two states are always distinguishable.
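Here is a sketch of the second option (an explicit count) in C#; the class name and the full/empty policy are mine.

```csharp
using System;

// Circular array queue using an explicit count, so empty (count == 0) and
// full (count == N) are never ambiguous.
class ArrayQueue<T>
{
    private readonly T[] _items;
    private int _head;   // index of the next element to dequeue
    private int _count;  // number of elements currently in the queue

    public ArrayQueue(int capacity) { _items = new T[capacity]; }

    public int Count { get { return _count; } }

    public void Enqueue(T item)
    {
        if (_count == _items.Length)
            throw new InvalidOperationException("Queue is full.");
        _items[(_head + _count) % _items.Length] = item;  // tail wraps around
        _count++;
    }

    public T Dequeue()
    {
        if (_count == 0)
            throw new InvalidOperationException("Queue is empty.");
        T item = _items[_head];
        _head = (_head + 1) % _items.Length;              // head wraps around
        _count--;
        return item;
    }
}
```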

The VM size in Task Manager is confusing.



Everybody who has taken a course of Operating system knows the difference between the virtual memory and physical memory. There is a good definition here : Virtual (or logical) memory is a concept that, when implemented by a computer and its operating system, allows programmers to use a very large range of memory or storage addresses for stored data. The computing system maps the programmer's virtual addresses to real hardware storage addresses. Usually, the programmer is freed from having to be concerned about the availability of data storage.

And the physical memory is the memory actually used by the process right now. So normally, you would think that the virtual memory used by a process should be larger than its physical memory.

But if you open Task Manager and compare the virtual memory and physical memory sizes, you will get a different answer.

The VM size is actually smaller than the physical memory size. The "VM Size" column in Task Manager is pretty misleading.

It is actually the "Private Bytes" counter used by the popular Process Explorer, which is only part of the virtual memory; it excludes memory allocated by the system. Process Explorer depicts a much better picture of the system state.

Using FxCop

I downloaded FxCop a long time ago, but really didn't use it much until recently.

I think every .NET programmer should run it on a routine basis. It's just too important to ignore. When I first ran it on my utilities project, it gave me over 3400 warnings, while the compiler hadn't given me a single one.

I am not saying all those warnings are valid. One of my text files, embedded as a resource, is a soundex index file with a lot of unrecognized words, so it produced a lot of errors under Microsoft.Naming. After I removed the file, the errors dropped by about half. Still, it's quite a lot.

I spent about 10 hours going through those warnings, and a lot of them are valid, such as IDisposable not implemented correctly, or explicit interface implementations that should be changed to conform to the standards.

Going through those warnings line by line taught me a lot about coding to the standard. A lot of things I normally never pay attention to actually deserve a lot of attention.

Good stuff, strongly recommended.