Version: 4.1
Notes for Developers of the WeeWX Weather System

This guide is intended for developers contributing to the open source project WeeWX.

Goals

The primary design goals of WeeWX are:

Strategies

To meet these goals, the following strategies were used:

While WeeWX is nowhere near as fast at generating images and HTML as its predecessor, wview (this is partially because WeeWX uses fancier fonts and a much more powerful templating engine), it is fast enough for all platforms but the slowest. I run it regularly on a 500 MHz machine where generating the 9 images used in the "Current Conditions" page takes just under 2 seconds (compared with 0.4 seconds for wview).

All writes to the databases are protected by transactions. You can kill the program at any time (either Control-C if run directly or "/etc/init.d/weewx stop" if run as a daemon) without fear of corrupting the databases.
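As a minimal sketch of what transaction protection buys you, the following uses Python's standard sqlite3 module. The function name add_records and the two-column archive table are simplifications assumed for illustration; they are not the actual WeeWX database layer or schema.

```python
import sqlite3

def add_records(connection, records):
    """Insert a batch of records as a single transaction (illustrative helper)."""
    # Used as a context manager, a sqlite3 connection commits if the block
    # succeeds and rolls back if an exception escapes it.
    with connection:
        for record in records:
            connection.execute(
                "INSERT INTO archive (dateTime, outTemp) VALUES (?, ?)",
                (record["dateTime"], record["outTemp"]),
            )

connection = sqlite3.connect(":memory:")
connection.execute(
    "CREATE TABLE archive (dateTime INTEGER NOT NULL PRIMARY KEY, outTemp REAL)"
)
add_records(connection, [{"dateTime": 1600000000, "outTemp": 68.5}])

try:
    # The second record repeats a primary key, so the whole batch is rolled back.
    add_records(connection, [
        {"dateTime": 1600000300, "outTemp": 68.7},
        {"dateTime": 1600000000, "outTemp": 70.0},
    ])
except sqlite3.IntegrityError:
    pass

stored = connection.execute("SELECT dateTime, outTemp FROM archive").fetchall()
```

After the failed batch, the table still holds only the first record: a write either lands completely or not at all.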

The code makes ample use of exceptions to ensure graceful recovery from problems such as network outages. It also monitors socket and console timeouts, retrying whatever it was working on several times before giving up. In the case of an unrecoverable console error (such as the console not responding at all), the program waits 60 seconds, then restarts from the top.
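The retry behavior can be sketched roughly as follows. The exception class ConsoleTimeout, the helper with_retries, and its parameters are hypothetical names chosen for this example, not part of the WeeWX API.

```python
import time

class ConsoleTimeout(Exception):
    """Hypothetical stand-in for a console or socket timeout."""

def with_retries(operation, max_tries=3, wait=0.0):
    """Call operation(), retrying on a timeout up to max_tries times."""
    for attempt in range(1, max_tries + 1):
        try:
            return operation()
        except ConsoleTimeout:
            if attempt == max_tries:
                raise  # Out of attempts: let the caller decide what to do
            time.sleep(wait)

calls = []

def flaky_read():
    """Simulated console read that times out twice, then succeeds."""
    calls.append(1)
    if len(calls) < 3:
        raise ConsoleTimeout("no response")
    return {"outTemp": 68.5}

packet = with_retries(flaky_read)
```

Only the expected timeout is caught here; any other exception propagates immediately.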

Any "hard" exceptions, that is those that do not involve network and console timeouts and are most likely due to a logic error, are logged, reraised, and ultimately cause thread termination. If this happens in the main thread (not likely due to its simplicity), then this causes program termination. If it happens in the report processing thread (much more likely), then only the generation of reports will be affected — the main thread will continue downloading data off the instrument and putting them in the database. You can fix the problem at your leisure, without worrying about losing any data.
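A rough sketch of the "log, reraise, and let the thread die" pattern, using only the standard library. The names run_reports and broken_generator are invented for this example and do not correspond to actual WeeWX functions.

```python
import logging
import threading

log = logging.getLogger("report_sketch")

def run_reports(generate):
    """Body of a hypothetical report thread: log hard errors, then reraise."""
    try:
        generate()
    except Exception:
        log.exception("Report generation failed")
        raise  # Terminates this thread only

def broken_generator():
    """Simulates a logic error in report generation."""
    raise KeyError("missing_template")

worker = threading.Thread(target=run_reports, args=(broken_generator,))
worker.start()
worker.join()

# The worker thread is gone, but the main thread carries on.
main_thread_alive = threading.main_thread().is_alive()
```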

Units

In general, there are three different areas where the unit system makes a difference:

  1. On the weather station hardware. Different manufacturers use different unit systems for their hardware. The Davis Vantage series use U.S. Customary units exclusively, Fine Offset and LaCrosse stations use metric, while Oregon Scientific, Peet Bros, and Hideki stations use a mishmash of US and metric.
  2. In the database. Either US or Metric can be used.
  3. In the presentation (i.e., HTML and image files).

The general strategy is that measurements are converted by service StdConvert as they come off the weather station into a target unit system, then stored internally in the database in that unit system. Then, as they come off the database to be used for a report, they are converted into a target unit, specified by the skin.
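The two conversion steps can be illustrated with a small sketch. The CONVERSIONS table and the convert function below are assumptions made for this example; WeeWX's own unit machinery is considerably more complete.

```python
# Hypothetical conversion table, keyed by (from_unit, to_unit).
CONVERSIONS = {
    ("degree_C", "degree_F"): lambda x: x * 9.0 / 5.0 + 32.0,
    ("degree_F", "degree_C"): lambda x: (x - 32.0) * 5.0 / 9.0,
}

def convert(value, from_unit, to_unit):
    """Convert a value between units, passing None through untouched."""
    if value is None or from_unit == to_unit:
        return value
    return CONVERSIONS[(from_unit, to_unit)](value)

# Step 1: a metric station reading is converted to the database's unit system (US here).
station_temp = 20.0
stored_temp = convert(station_temp, "degree_C", "degree_F")

# Step 2: at report time, the stored value is converted to the unit the skin asks for.
reported_temp = convert(stored_temp, "degree_F", "degree_C")
```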

Value "None"

The Python special value None is used throughout to signal an invalid or bad data point. All functions must be written to expect it.

Device drivers should be written to emit None if a data value is bad (perhaps because of a failed checksum). If the hardware simply doesn't support an observation type, then the driver should not emit a value for it at all.

The same rule applies to derived values. If the input data for a derived value are missing, then no derived value should be emitted. However, if the input values are present, but have value None, then the derived value should be set to None.

However, the time value must never be None. This is because it is used as the primary key in the SQL database.
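These rules can be sketched as follows. The derived type tempDiff and the function add_derived are made up for this example, and inTemp is assumed to be an available observation type.

```python
def add_derived(record):
    """Add a hypothetical derived value, tempDiff = outTemp - inTemp."""
    # The time value is the primary key, so it must always be present and valid.
    if record.get("dateTime") is None:
        raise ValueError("dateTime must never be None")
    # Inputs missing entirely: emit no derived value.
    if "outTemp" not in record or "inTemp" not in record:
        return record
    out_temp, in_temp = record["outTemp"], record["inTemp"]
    # Inputs present but bad: the derived value is None.
    if out_temp is None or in_temp is None:
        record["tempDiff"] = None
    else:
        record["tempDiff"] = out_temp - in_temp
    return record

missing = add_derived({"dateTime": 1600000000, "outTemp": 70.0})
bad = add_derived({"dateTime": 1600000000, "outTemp": None, "inTemp": 68.5})
good = add_derived({"dateTime": 1600000000, "outTemp": 70.0, "inTemp": 68.5})
```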

Time

WeeWX stores all data in UTC (roughly, "Greenwich" or "Zulu") time. However, one is usually interested in weather events in local time and wants image and HTML generation to reflect that. Furthermore, most weather stations are configured in local time. This requires that many data times be converted back and forth between UTC and local time. To avoid tripping up over time zones and daylight saving time (DST), WeeWX generally uses Python routines to do this conversion. Nowhere in the code base is there any explicit recognition of DST. Instead, its presence is implicit in the conversions. At times, this can cause the code to be relatively inefficient.

For example, if one wanted to plot something every 3 hours in UTC time, it would be very simple: to get the next plot point, just add 10,800 to the epoch time:

next_ts = last_ts + 10800 

But, if one wanted to plot something for every 3 hours in local time (that is, at 0000, 0300, 0600, etc.), despite a possible DST change in the middle, then things get a bit more complicated. One could modify the above to recognize whether a DST transition occurs sometime between last_ts and the next three hours and, if so, make the necessary adjustments. This is generally what wview does. WeeWX takes a different approach and converts from UTC to local, does the arithmetic, then converts back. This is inefficient, but bulletproof against changes in DST algorithms, etc:

import datetime
import time

time_dt = datetime.datetime.fromtimestamp(last_ts)
delta = datetime.timedelta(seconds=10800)
next_dt = time_dt + delta
next_ts = int(time.mktime(next_dt.timetuple()))

Other time conversion problems are handled in a similar manner.
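As one more illustration of the same convert, compute, convert-back approach, here is a sketch that finds the start of the local day containing a timestamp. The function name start_of_local_day is chosen for this example and is not necessarily what WeeWX calls its own utility.

```python
import datetime
import time

def start_of_local_day(ts):
    """Return the epoch time of local midnight for the day containing ts."""
    # Convert from epoch time to local time...
    local_dt = datetime.datetime.fromtimestamp(ts)
    # ...do the arithmetic in local time...
    midnight_dt = local_dt.replace(hour=0, minute=0, second=0, microsecond=0)
    # ...then convert back, letting the library work out whether DST applies.
    return int(time.mktime(midnight_dt.timetuple()))

sample_ts = 1600000000
day_start_ts = start_of_local_day(sample_ts)
```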

For astronomical calculations, WeeWX uses the latitude and longitude specified in the configuration file. If that location does not lie in the computer's local time zone, reports with astronomical times will probably be incorrect.

Internationalization

Generally, WeeWX does not make much use of Unicode. This is because the Python 2.x libraries do not always handle it correctly. In particular, the function time.strftime() completely fails when handed a Unicode string with a non-ASCII character. As this function is often used by extensions, working around this bug is an unfair expectation on extension writers. So, we generally avoid Unicode.

Instead, WeeWX mostly uses regular strings, with any non-ASCII characters encoded as UTF-8.

An exception to this general rule is the image generator, which holds labels internally in Unicode, because that is the encoding expected by most fonts.
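In Python 3 terms, the distinction between a UTF-8 encoded string and its Unicode form looks like the sketch below. The label text is just an example.

```python
# A Unicode label, as the image generator would hold it internally.
label = "Température"

# The same label encoded as UTF-8 bytes: "é" takes two bytes, so the
# encoded form is one byte longer than the character count.
encoded = label.encode("utf-8")

# Decoding the UTF-8 bytes recovers the original Unicode string.
decoded = encoded.decode("utf-8")
```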

The document "The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets" by Joel Spolsky is highly recommended if you are just starting to work with UTF-8 and Unicode.

Exceptions

In general, your code should not simply swallow an exception. For example, this is bad form:

    try:
        os.rename(oldname, newname)
    except:
        pass

While the odds are that if an exception happens it will be because the file oldname does not exist, that is not guaranteed. It could be because of a keyboard interrupt, or a corrupted file system, or something else. Instead, you should test explicitly for any expected exception, and let the rest go by:

    try:
        os.rename(oldname, newname)
    except OSError:
        pass

WeeWX has a few specialized exception types, used to rationalize all the different types of exceptions that could be thrown by the underlying libraries. In particular, low-level I/O code can raise a myriad of exceptions, such as USB errors, serial errors, network connectivity errors, etc. All device drivers should catch these exceptions and convert them into an exception of type WeeWxIOError or one of its subclasses.
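A driver might do the conversion along these lines. So that the sketch runs without WeeWX installed, it defines its own stand-in WeeWxIOError class (its base class here is an assumption); read_raw, get_packet, and the port name are hypothetical.

```python
class WeeWxIOError(IOError):
    """Stand-in for the WeeWX exception type, defined here only for the sketch."""

def read_raw(port):
    """Hypothetical low-level read that fails the way an I/O library might."""
    raise OSError("device disconnected")

def get_packet(port):
    """Read a packet, converting low-level errors into WeeWxIOError."""
    try:
        return read_raw(port)
    except (OSError, EOFError) as e:
        # Catch only the expected low-level errors and rewrap them.
        raise WeeWxIOError("Unable to read from %s: %s" % (port, e)) from e

try:
    get_packet("/dev/ttyUSB0")
    caught = None
except WeeWxIOError as e:
    caught = e
```

Callers then need to handle only one exception family, whatever library the driver uses underneath.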

Code style

Generally, we try to follow the PEP 8 style guide, but there are many exceptions. In particular, many older WeeWX function names use camelCase, but PEP 8 calls for snake_case. Please use snake_case for new code.

Most modern code editors, such as Eclipse or PyCharm, have the ability to automatically format code. Resist the temptation and don't use this feature! Two reasons:

If you are working with a file where the formatting is so ragged that you really must do a reformat, then do it as a separate commit. This allows the formatting changes to be clearly distinguished from more functional changes.

When invoking functions or instantiating classes, use the fully qualified name. Don't do this:

from datetime import datetime
now = datetime.now()

Instead, do this:

import datetime
now = datetime.datetime.now()

Git work flow

We use git as the source control system.

We generally follow Vincent Driessen's branching model. Ignore the complicated diagram at the beginning of the article, and just focus on the text. In this model, there are two key branches:

What this means to you is that if you submit a pull request that includes a new feature, make sure you commit your changes relative to the development branch.

Tools

Python

Eclipse, with the PyDev Python extension, is highly recommended. It's free, easy to customize and extremely powerful.

JetBrains' PyCharm is also good, and now there's a free Community Edition. It really shines if you use a framework such as Django or Backbone, but WeeWX does not use any of these, so there is no real need for PyCharm's extra functionality when working with WeeWX.

HTML and Javascript

For HTML, JetBrains' WebStorm used to be the undisputed master. However, in recent years, I've found Eclipse's "Web Development Tools" to be its equal, or even better, particularly when working with long HTML documents like the Customizing Guide.

However, if you are working with Javascript, particularly if you're using a framework like NodeJS or ExpressJS, there is no contest: WebStorm is the way to go.

Glossary

This is a glossary of terminology used throughout the code.

Terminology used in WeeWX
Name: Description
archive interval: WeeWX does not store the raw data that comes off a weather station. Instead, it aggregates the data over a length of time, the archive interval, and then stores that.
archive record: While packets are raw data that comes off the weather station, records are data aggregated by time. For example, temperature may be the average temperature over an archive interval. These are the data stored in the SQL database.
config_dict: All configuration information used by WeeWX is stored in the configuration file, usually with the name weewx.conf. By convention, when this file is read into the program, it is called config_dict, an instance of the class configobj.ConfigObj.
datetime: An instance of the Python object datetime.datetime. Variables of type datetime usually have a suffix _dt.
db_dict: A dictionary with all the data necessary to bind to a database. An example for SQLite would be {'driver':'db.sqlite', 'root':'/home/weewx', 'database_name':'archive/weewx.sdb'}, an example for MySQL would be { 'driver':'db.mysql', 'host':'localhost', 'user':'weewx', 'password':'mypassword', 'database_name':'weewx'}.
epoch time: Sometimes referred to as "unix time," or "unix epoch time." The number of seconds since the epoch, which is 1 Jan 1970 00:00:00 UTC. Hence, it always represents UTC (well... after adding a few leap seconds. But, close enough). This is the time used in the databases and appears as type dateTime in the SQL schema, perhaps an unfortunate name because of the similarity to the completely unrelated Python type datetime. Very easy to manipulate, but it is a big opaque number.
LOOP packet: The real-time data coming off the weather station. The terminology "LOOP" comes from the Davis series. A LOOP packet can contain all observation types, or it may contain only some of them ("Partial packet").
observation type: A physical quantity measured by a weather station (e.g., outTemp) or something derived from it (e.g., dewpoint).
skin_dict: All configuration information used by a particular skin is stored in the skin configuration file, usually with the name skin.conf. By convention, when this file is read into the program, it is called skin_dict, an instance of the class configobj.ConfigObj.
SQL type: A type that appears in the SQL database. This usually looks something like outTemp, barometer, extraTemp1, and so on.
standard unit system: A complete set of units used together. Either US, METRIC, or METRICWX.
time stamp: A variable in unix epoch time. Always in UTC. Variables carrying a time stamp usually have a suffix _ts.
tuple-time: An instance of the Python object time.struct_time. This is a 9-element tuple that represents a time. It could be in either local time or UTC, though usually the former. See module time for more information. Variables carrying tuple time usually have a suffix _tt.
value tuple: A 3-way tuple. First element is a value, second element the unit type the value is in, the third the unit group. An example would be (21.2, 'degree_C', 'group_temperature').
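The naming suffixes and the value tuple from the glossary can be seen together in a short sketch. The variable names and the sample timestamp are chosen only for illustration.

```python
import datetime
import time

# A time stamp: unix epoch time, always UTC (suffix _ts).
sample_ts = 1600000000

# The same instant as a Python datetime in local time (suffix _dt).
sample_dt = datetime.datetime.fromtimestamp(sample_ts)

# The same instant as a tuple-time, a time.struct_time in local time (suffix _tt).
sample_tt = time.localtime(sample_ts)

# Converting the tuple-time back recovers the original time stamp.
round_trip_ts = int(time.mktime(sample_tt))

# A value tuple: value, unit type, unit group.
value_tuple = (21.2, 'degree_C', 'group_temperature')
value, unit, group = value_tuple
```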