Evaluation of GNUmed

 Author:   David Mertz, Ph.D.
 Contact:  mertz@gnosis.cx / +1-413-824-9414
 Version:  1
Copyright: This document has been placed in the public domain.
  Date:    August, 2010

Contents

  • Background of Evaluation
  • Personal Background
  • Overview of Evaluation
  • Details on Source Code Evaluation
      □ Unified Printing System
      □ Documentation for developers
      □ Code Organization
      □ SQL

Background of Evaluation

In September 2009, Jim Busser -- a medical doctor who has
been a non-coding volunteer in the GNUmed project --
contacted the Python Software Foundation (PSF) Board of
Directors seeking advice in finding an independent professional
evaluation of the codebase of GNUmed. Dr. Busser expressed an
interest in performing due diligence to assure that the
project's design decisions have been made soundly, that the
code meets professional software coding standards, and
that security issues have been adequately addressed in
protection of confidential data handled by GNUmed software.

Subsequent to this inquiry, as a then and current member of
the PSF Board, I forwarded the request to the general Python
Software Foundation membership, with an offer to Dr. Busser
to help vet any replies that arose there for competent
evaluators. Unfortunately, I wasn't able to locate any PSF
members interested and competent to work on such an
evaluation by these means; however, in light of this, I
volunteered to make such an evaluation myself, being
impressed with the goals of the project and usefulness of the
software after an overview evaluation. I performed this
evaluation in November of 2009, and have monitored the
general progress of the project since then.

I accepted a (somewhat nominal) payment for my time spent in
performing an evaluation, with the agreement that the
evaluation would be independent, and no a priori focus or
conclusion was directed by such payment. The goal expressed
by GNUmed project members I communicated with was that
anything I might find showing need for improvements would be
at least as valuable as an overview conclusion of design
soundness.

Personal Background

As well as my membership in the Python Software Foundation
and its Board of Directors, I serve as Chair of the PSF
Trademarks Committee, and have written a book and numerous
well-read articles about the Python programming language and
other computer technologies, including databases, computer
security and networking.

In a professional volunteer capacity, I served as the Chief
Technology Officer of the non-profit Open Voting Consortium
(OVC), which has advocated for and developed the use of open
source, secure and voter-transparent software for use in
governmental elections. Much of the software developed under
my supervision at OVC is written in Python.

In consulting positions I have worked with and evaluated a
variety of software systems, including large codebases,
written in Python and other programming languages. Additional
resume details of this evaluator are available upon request.

Overview of Evaluation

As is typical of open source software, the entirety of the
GNUmed code base is available for download and evaluation,
without restrictions, for any interested user. As an initial
step of the evaluation, I downloaded the current code base in
source form, as well as several packaged binary distributions
of the software. Because GNUmed is written in a runtime-interpreted
programming language -- Python -- a separate compilation step
is not required to run it; however, GNUmed authors have
packaged the source files into bundles for several operating
systems (including a self-booting CD distribution) for easier
evaluation and use. These bundled forms are not required for
use, but make resolution of dependencies and configurations
simpler, especially for non-technical users.

In my evaluation, I initially examined the runtime user
interface of GNUmed, including its help system. I am not
expert in the specific needs of medical records software, but
am broadly familiar with design of user interfaces. I found
the user interface to be simple to operate, to contain a dense
but well-structured presentation of record information, and
to avoid any apparent glitches or opaque functionality. Given
the inherent complexity of the task of medical record
management, I believe that lay users of the interface will
require at least several hours of training to become fully
familiar with the user interface of GNUmed. However, there
are no unnecessary surprises in its operation or organization
of information and I assume a similar training requirement
will exist for any other open source or proprietary software
in the same domain.

My evaluation of the user interface was performed primarily to
familiarize myself with its overall operation to a degree
sufficient to understand the purpose of the source code that
implemented these functions. The bulk of my evaluation was in
examining that source code for typical design or coding flaws
that might be found in large and/or open source projects.

I was pleased to find the code base for GNUmed well
structured, cleanly coded, and free of any broad flaws that I
was able to detect. This is a strong project written by
highly competent and well-disciplined coders with a good
knowledge of Python programming practices and of general
database and security issues. I believe the existing releases
of GNUmed are stable and secure, and that the code
organization is such as to allow relatively easy extension
and customization as is needed for specific installations and
implementations.

Details on Source Code Evaluation

Following my review of the source code, I provided a set of
specific comments. Some additional points came out during
discussion with GNUmed developers, which I also address here.
I am not certain which of these issues have been mooted by
releases more current than those I examined carefully, but
none of them are "deal breaker" concerns; they are simply
tweaks toward best coding practices.

Unified Printing System

Because GNUmed is written to run on multiple operating systems
in multiple configurations, producing consistent printed reports
is something of a challenge, one that GNUmed had not fully
addressed as of November 2009. Technical options for printing
systems are discussed on the GNUmed Wiki.

Documentation for developers

  • A README in the top level directory would be a good place to put a section on "How to get started with editing source". Even a few paragraphs would help developers
    get oriented more quickly.
  • The check-prerequisites.(sh|py) tool is a surprisingly friendly introduction to setup. That's a nice touch.
  • The use of "Programmer notes" was a nice touch in the "GUI elements" page of the User Manual. Adding such notes in other places would help a programmer encountering the
    software identify relations between functionality and code design. Of course, that could be intrusive for regular users, so some means of tucking away such notes
    would be good. Perhaps CSS to fold them away, or perhaps an "enhanced manual" that included them, with the end-user version generated by filtering out those
    sections.
  • The API documentation is good; however, it would be worthwhile to include these pages with the GNUmed tarball, or in a supplementary GNUmed-Docs.tgz archive. The
    salaam.homeunix.com site seems to perform fairly poorly, and links to it are not as obvious as one would wish.

Code Organization

  • The directory organization is extremely inviting to developers. The separation of business/, wxpython/, and pycommon/ is very well done. I do not believe it would be
    particularly difficult to jump into the code to add a new feature.
  • When I first look at a set of business logic like this, tied to a database, the thought strikes me that perhaps the database operations should be factored out of the
    rest of the business logic. However, looking at this code, I am not inclined toward such further separation. Those methods that access the DB are primarily driven by
    those operations, with a minimum of other checks or calculations in the Python code, and these are quite readable as-is. Moreover, my own feeling is that something
    like an ORM is almost always more fragile than it is worth for obtaining more native-feeling source code. That applies here clearly, and the relational design is
    more robust than an ORM could capture.
  • What is actually working? This I find less clear than I would like (as either a user or a developer). For example, in 0.5.1 the file gmMedication.py contains
    cConsumedSubstance(). The naming seems to provide a good indication of the function of the class. However, the bulk of the class is commented out in the
    implementation (I would guess because of some bug in its details, though the commented-out code makes basic functional sense). The result, I presume, is that allergy
    information cannot be modified in this version. A TODO or WHATSNEW in the source code tree could clarify this, though obviously such a file itself takes work to
    maintain. I may be missing something functional along these lines within some other wiki page, issue/ticket tracker, or other out-of-stream documentation.
  • Unit tests: Many of the business/ modules have some unit tests built in to them. However, they do not seem to be systematic, and the results seem to need
    interpretation. For example, gmVaccination.py has quite a few tests, but all of them print out report-style values and never raise explicit unit test failures if
    values are not as expected. It is possible that some called functions will raise exceptions in failure cases, but then it's a matter of tracing through call stacks
    rather than reading clearly encapsulated error reports.

      □ Moreover, it would be nice if the unit tests came in a defined suite that could be run with a "test_all" command. It might not take that much to cobble together a
        wrapper that called the various modules and tested the outputs against expectations. Having that would be very helpful to release-cycle management (or even to
        running nightly regressions).
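
A minimal sketch of what explicit failures and a single entry point might look like, using only the standard unittest module. The function doses_remaining() and the runner run_all_tests() are hypothetical illustrations, not GNUmed code; the point is only that each expectation is asserted rather than printed for a human to interpret.

```python
import unittest


def doses_remaining(scheduled, given):
    """Hypothetical stand-in for a piece of business logic under test."""
    if given < 0 or given > scheduled:
        raise ValueError("given doses must be between 0 and scheduled")
    return scheduled - given


class DosesRemainingTest(unittest.TestCase):
    """Each expectation is asserted, so a wrong value fails loudly."""

    def test_partial_course(self):
        self.assertEqual(doses_remaining(3, 1), 2)

    def test_completed_course(self):
        self.assertEqual(doses_remaining(3, 3), 0)

    def test_impossible_count_is_rejected(self):
        with self.assertRaises(ValueError):
            doses_remaining(3, 4)


def run_all_tests(*case_classes):
    """Collect the given TestCase classes into one suite and run it."""
    loader = unittest.TestLoader()
    suite = unittest.TestSuite(
        loader.loadTestsFromTestCase(case) for case in case_classes
    )
    return unittest.TextTestRunner(verbosity=1).run(suite)
```

A wrapper script could call run_all_tests() with the test classes from each module and use result.wasSuccessful() to set its exit status, which is the kind of signal a nightly regression job needs.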

SQL

  • Having an easy way to identify the table schemas would be a huge aid to new programmers. As is, there's nothing in the client code other than examples of existing
    SELECT/INSERT statements. To see the organization, one needs to also grab the server code and browse around the sql/ directory. It would be really helpful -- even
    on an ongoing basis, not only as a first look at code -- to have a schema diagram of the tables and fields available. There might be FOSS tools to generate these
    from the SQL; I'd have to look around to see the state of that. There are two levels of development where the schema diagrams would help:

     1. Answering the question, "Where should I put or find this data that my new feature wants to work with?" I.e. the table and field names (and any foreign key
        constraints I need to maintain).
     2. Answering the question, "Is there ANY existing table where my new feature can store its data? Do I need to change the schema to accommodate this new feature?"

  • Following up on the salaam.homeunix.com site, and partially superseding the comment I make above about database schemas, I see there are schema descriptions at
    http://salaam.homeunix.com/~ncq/gnumed/schema/devel/. Finding this was non-obvious; I only came across it by playing with the linked URL for the API docs to guess
    at what else was on that site.
  • A minor code-style issue: I find it easier to scan SQL statements if they use consistent casing (even though SQL keywords are case-insensitive). In particular, in my
    own code, I think it stands out nicely to put SQL keywords in CAPS, and table and field names in lowercase. The GNUmed source does this sometimes, but not
    consistently.

E.g. instead of:

cmd = u"""
      insert into dem.lnk_person_org_address(id_identity, id_address)
      values (%(id)s, %(adr)s)"""

It would scan nicely to use:

cmd = u"""
      INSERT INTO dem.lnk_person_org_address(id_identity, id_address)
      VALUES (%(id)s, %(adr)s)"""

Or even if the convention is all-lower, it should be consistent in the code.
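
Consistency of this kind could be checked mechanically. The following is a rough sketch, assuming an uppercase-keyword convention and a deliberately short keyword list; the helper non_uppercase_keywords() is hypothetical, and as a plain regular-expression heuristic it does not understand string literals or comments inside the SQL.

```python
import re

# A deliberately small keyword list; a real check would need a fuller one.
SQL_KEYWORDS = (
    "SELECT", "INSERT", "INTO", "VALUES",
    "UPDATE", "DELETE", "FROM", "WHERE",
)
_KEYWORD_RE = re.compile(
    r"\b(" + "|".join(SQL_KEYWORDS) + r")\b", re.IGNORECASE
)


def non_uppercase_keywords(sql):
    """Return the SQL keywords in `sql` that are not written in CAPS."""
    return [
        match.group(0)
        for match in _KEYWORD_RE.finditer(sql)
        if match.group(0) != match.group(0).upper()
    ]


cmd_lower = """
      insert into dem.lnk_person_org_address(id_identity, id_address)
      values (%(id)s, %(adr)s)"""

cmd_upper = """
      INSERT INTO dem.lnk_person_org_address(id_identity, id_address)
      VALUES (%(id)s, %(adr)s)"""

print(non_uppercase_keywords(cmd_lower))  # ['insert', 'into', 'values']
print(non_uppercase_keywords(cmd_upper))  # []
```

If the project preferred the all-lower convention instead, the comparison would simply be inverted; what matters is that one rule is applied everywhere.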

  • Over-the-wire encryption is available with the sslmode=prefer option in the psycopg2 interface. That seems fine, though it might be nice to optionally allow SSH
    tunneling instead (which is outside the psycopg2 interface itself, of course).
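
As a sketch of how the encryption setting might be made explicit, the helper below builds a libpq-style connection string of the kind psycopg2.connect() accepts. The function build_dsn() and the host, database and user names are hypothetical; only the sslmode values are taken from libpq. Note that with sslmode=prefer libpq falls back to an unencrypted connection when the server does not offer SSL, so an installation that must have encryption may want "require" or stricter.

```python
# sslmode values defined by libpq, from least to most strict.
VALID_SSLMODES = (
    "disable", "allow", "prefer", "require", "verify-ca", "verify-full",
)


def _quote(value):
    """Quote a value for a libpq keyword/value connection string."""
    text = str(value)
    needs_quotes = (
        text == ""
        or any(ch.isspace() for ch in text)
        or "'" in text
        or "\\" in text
    )
    if needs_quotes:
        escaped = text.replace("\\", "\\\\").replace("'", "\\'")
        return "'" + escaped + "'"
    return text


def build_dsn(host, dbname, user, sslmode="require", port=5432):
    """Build a connection string, rejecting unknown sslmode values."""
    if sslmode not in VALID_SSLMODES:
        raise ValueError("unknown sslmode: %r" % (sslmode,))
    params = {
        "host": host, "port": port, "dbname": dbname,
        "user": user, "sslmode": sslmode,
    }
    return " ".join("%s=%s" % (key, _quote(val)) for key, val in params.items())


# Direct connection that refuses to proceed without SSL:
direct = build_dsn("db.example.org", "gnumed", "any-doc")

# Through an SSH tunnel opened separately, for example with
#   ssh -L 5433:localhost:5432 user@db.example.org
# the client connects to the local end; SSH provides the encryption.
tunneled = build_dsn("localhost", "gnumed", "any-doc", sslmode="disable", port=5433)

# Either string could then be passed as psycopg2.connect(direct).
```

Keeping the mode in one validated place would also make it straightforward to expose as a configuration option per installation.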