Virtualisation of the desktop

As I explained in an earlier post, I have a problem with the unreliability of Windows. There I looked at all the applications I currently use and mooted the possibility of changing over to Linux, using a virtualisation system either to ease the transition or, possibly permanently, to resolve the problem of legacy systems that could not be ported.

Having now looked at it more deeply, I can see that if it were viewed from the opposite direction—embracing virtualisation for its own merits—then further benefits could be obtained. Normally I am in favour of a single tool for a single job. I apply this to software (not liking integrated suites) and to Hi-Fi (preferring separates to music-centres), but I don’t seem to have applied it to the PC very much. PCs are sold as general-purpose, do-it-all systems, so they end up with a huge mish-mash of applications installed. The assumption is that this brings great benefits, in that data from one system can easily be passed to another, but is that actually the case, or even necessary? I have already isolated a few roles for other reasons: data storage and archive have been offloaded to a NAS, and music playback has its own dedicated systems; so we can start from that position.

To look at this we need to see what the data flows are for typical regular tasks. I analysed five regular jobs that I currently perform on my PC to discover the interactions between them: 1) rip/digitise a music album, 2) publish a census transcript, 3) research and update the family tree, 4) prepare a projector schedule for a Sunday service, and 5) update a web site. On top of these is the routine everyday activity of email, web browsing and editing text and word-processed documents. Coincidentally, or perhaps by premonition, the tasks closely match the application groups that I used in the previous article.

Apart from a lot of interaction with email, browsing and documents, partly by copy and paste and partly by manual retyping, I discovered that there was very little interaction between the jobs. Tasks 2 and 5 share a lot of data-handling techniques, but these are largely low-level text editing and spreadsheet work, which are core activities. Tasks 3 and 5 share a web-upload requirement. If I had separate machines, the work could be divided between them so that each one performed a single function, and little data would need to pass between them. Where virtualisation comes in is that it removes the expense, space and overhead of multiple sets of hardware and software licences. It also allows data to be pasted across on the occasions when that would be useful. New versions of major applications can be tested in their own Virtual Machines (VMs). Finally, most virtualisation systems can suspend machines, which makes for a much faster setup because the typically used applications in each can already be running and just require data to be loaded.

A proposed design would be…

A Windows host platform

There has to be a base upon which the dedicated task VMs run. For reasons of economy this needs to be my existing desktop system, as described in the previous post. Because they need direct access to the hardware, some applications will also have to remain on the host system. Many of these are Windows-only, which forces us towards a Windows host operating system. For reliability (the reason we started this whole exercise) we need to minimise what runs natively. Although there looks to be a lot of stuff in the list below, with the exception of the first two items it is all lightweight and poses little threat.

  • The Patchmix DSP interface to the sound card, the Matrox driver for the graphics card and ScanGear, the Twain driver for the SCSI scanner. These specialist drivers possibly pose the greatest threat to the stability of the machine as a whole.
  • Wave Corrector + LAME—probably needed here so that it has native access to the sound streams.
  • Pen Drive Manager and PINS. Other VMs will need access to the memory sticks and passwords.
  • Notetab & WinZip—because they will be useful.
  • Backup4all—to back everything up.
  • AVG—for safety.
  • ZoneAlarm—to protect the whole system.

The specialist task guest Virtual Machines

Legacy applications (those that have to run on Windows and for which there is no viable alternative) come in five groups, each of which could be defined as a separate guest VM.

  • Presentation—EasyWorship + K-Lite + PowerPoint Viewer. I am not sure how EasyWorship will cope with running in a VM, as it is really a dual-screen program. ChipmunkAV could also be installed here for development, as that is where it would belong. There is an opportunity here to create a VM which closely matches the live system used at church, and it is a system that would benefit from a separate development environment for new versions.
  • Music processing—Exact Audio Copy + Accurate Rip + LAME, MediaMonkey and Audacity + LAME for convenience (even though a Linux version is available).
  • Family History—Family Tree Maker, Resource File Viewer, GED2HTML and DjVu (as it is not used for anything else). Also included here are the following functions, which are not used enough to warrant separate VMs:
    • GPS—Garmin Mapsource and waypoint manager. Our Advent GPS is managed from Mary’s laptop.
    • Web maintenance—Zoom, possibly with CuteFTP because it is available, and OmniPage Pro for OCR. I haven’t seen any information about whether the Twain interface can be passed through to guest VMs, nor have I seen any decent Linux OCR systems.

The remaining applications can be ported to Linux guest VM(s).

  • General home office—Firefox, Thunderbird + PopFile, a Jabber Client, OpenOffice, Adobe Reader, GnuPG, a Pop Peeper replacement and a picture editor.
    • An FTP system (initially, command line FTP would probably do)
    • A torrent client. Despite my listing µTorrent as “music download” in the earlier post, I realised that the majority of my torrent traffic is not the typical music/video bootlegging that torrents are often used for, but mostly open software and data sharing. It would run here because this VM is likely to be running most of the time. You could argue that it should be on the base host for this reason, but I suspect that the Windows version may be contributing to the current machine instability.

    This VM would be used for all the routine email/browsing activity, plus the census publishing and general web maintenance, which use similar tools (hence the FTP client). Turboprint will be needed for printing.

  • Software and Web development—a CVS system, GNU C++, home-grown software, HTML Tidy. The VM gives the opportunity to run a local web server, which would allow web applications such as WordPress to be tested and developed locally (see the sketch below).
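
How that local test server might start out is illustrated below. This is only a rough sketch using Python’s built-in web server to preview a local copy of the static pages; the directory path is a placeholder, and WordPress itself would still need a proper PHP/MySQL stack rather than this.

```python
# Throwaway preview server for the static pages of a local site copy.
# SITE_DIR is a placeholder path, not a real location on my system.
import functools
import http.server

SITE_DIR = "/home/user/sites/local-copy"
PORT = 8080

handler = functools.partial(http.server.SimpleHTTPRequestHandler,
                            directory=SITE_DIR)
http.server.HTTPServer(("", PORT), handler).serve_forever()
```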

That is five guest VMs (not all running at once, of course), but is this overkill?

Lingering Questions

  • Where do I keep the data? Application-specific data would be better on the local VM, but how, then, would it be backed up? VMs use the concept of virtual disks for their operating systems and data; these are seen by the host system as just big unstructured files, so backup from there is limited. Some also allow the use of real disks or partitions, but this creates the problem of drive fragmentation.
    Would it all be better on the NAS? It has the flexibility of flexibly sized share points which are visible from any machine. What are the implications for backup there? A NAS holding live data would suggest the need for a backup and archive on yet another separate system; at present there is only a weekly disk shadow. (One possible approach is sketched after this list.)
  • Will the single Windows XP licence be sufficient for the host and multiple guest systems? MS Virtual PC presumably allows it. I am told that other VM systems work ok too.
  • Do guest VMs run under a Windows Limited User account, or do they need an Administrator host?
  • What issues are there for file sharing between Windows and Linux? Plain text files could be a problem because of the different line endings (the second sketch after this list shows the sort of fix-up involved).
  • Is there any point in running the guest Windows VMs at all? Why not just run all the legacy applications on the host platform?
  • Does this solution provide any benefits to justify its relative complexity? Will it make the overall system any more reliable and, in particular, will it isolate failures to just a single component?
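
On the first question, one pragmatic answer would be to keep live application data inside each VM but mirror it regularly to a share on the NAS. The following is only a sketch of that idea: the directory names are placeholders and it assumes the NAS share is already mounted (via Samba or NFS) inside the Linux guest.

```python
# Mirror a VM's application data directory to a dated folder on the NAS.
# Both paths are placeholders; the NAS share is assumed to be mounted
# already at NAS_BACKUP (e.g. via Samba or NFS).
import shutil
from datetime import date
from pathlib import Path

DATA_DIR = Path.home() / "family-history"             # hypothetical local data
NAS_BACKUP = Path("/mnt/nas/backup/family-history")   # hypothetical NAS mount

dest = NAS_BACKUP / date.today().isoformat()
shutil.copytree(DATA_DIR, dest)   # simple dated full copy
print("Backed up", DATA_DIR, "to", dest)
```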
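
On the line-ending question, the fix-up is trivial to script when it does bite. A minimal sketch (the file name is just an example) that normalises Windows CRLF endings to Unix LF:

```python
# Normalise Windows (CRLF) line endings to Unix (LF) in a text file.
# "notes.txt" is only an example name.
from pathlib import Path

path = Path("notes.txt")
path.write_bytes(path.read_bytes().replace(b"\r\n", b"\n"))
```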

The Future

Considering how little remains on the host, the next time I need a machine replacement (presuming the old one is not actually broken) I should retain this machine for the sound-card-related applications and build the new one as a Linux host platform with the “General home office” configuration, running the other specialist VMs under that. The need for the digitising function should be quite low by then.

Appendix—the task analysis and data flows

  1. Rip/Digitise a music album.
    1. Ripping a CD. Exact Audio Copy reads the CD, passes the data internally to LAME, and the resulting MP3 goes directly to the shared NAS. AccurateRip references its online database. MediaMonkey indexes the MP3, re-tags it from internet sources and downloads the cover art. All data involved is already on shared media, so there is no problem in devolving the processes.
    2. Digitising a vinyl or tape album. Wave Corrector records the audio stream to a temporary WAV file on local disk. As a separate process, Wave Corrector is then used to edit and de-click the recording, pass the data internally to LAME, and write the MP3 directly to the shared NAS. From there the process is the same as above, except that the cover art may be downloaded manually (using Firefox) and may need some picture editing. The temporary WAV file and the Wave Corrector log (session file) are archived to the NAS.
  2. Publish a census transcript. The transcript arrives by email in a ZIP file, which is extracted using WinZip. The processing involves a mixture of spreadsheet work, a text editor and home-written software to generate the output HTML pages and a zipped archive. Email is used to distribute the results. As a secondary task, the browser is used to update a progress log (via Google Calendar) and IM is used for communication.
  3. Research and update the family tree. Input is by web search and email. The dedicated database is updated using Family Tree Maker. The web version is created by a piece of software I had forgotten about, called GED2HTML, and uploaded using FTP.
  4. Preparing a projector schedule. The order of service arrives by email as a Word document. This is transferred, generally manually, into EasyWorship, except that new songs and liturgy are cut and pasted either from the order of service or from a web-based database. The output is transferred to a memory stick for transport. A secondary task may involve obtaining or scanning images and editing them; these are imported into the EasyWorship system. Video clips and PowerPoint scripts are taken without modification, though they may be viewed as a check.
  5. Updating a web site.
    1. The majority of this is done using a text editor (for the plain pages) or a browser (for the content-managed pages). Secondary tasks include scanning and editing pictures and, for some pages, data manipulation using OCR, spreadsheets etc. Server software comes in ZIP format. Upload is done using CuteFTP. The search engine index is created using Zoom, which reads a local copy of the web pages and generates data files that are uploaded to the server, sometimes using Zoom’s internal FTP process and sometimes using CuteFTP (a scripted alternative is sketched after this appendix).
    2. Maintaining the online WordPress software and themes. This is done rather tediously using a text editor, followed by an FTP upload to a sandpit web site and a test. It is a very slow form of software development.
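
Once this work moves into the Linux VM, even the “command line FTP would probably do” option mentioned earlier could be reduced to a few lines of script. This is only a sketch using Python’s standard ftplib; the host, credentials and paths are placeholders, not the real site details.

```python
# Upload a freshly generated page to the web server over FTP.
# Host, login details and paths below are placeholders only.
from ftplib import FTP

with FTP("ftp.example.org") as ftp:
    ftp.login("username", "password")
    ftp.cwd("/public_html/census")
    with open("index.html", "rb") as fh:
        ftp.storbinary("STOR index.html", fh)
```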
