Different feed types and versions use wildly different date formats. Universal Feed Parser attempts to auto-detect the date format used in any date element and parses it into a standard Python 9-tuple in UTC, as documented in the Python time module.
The following elements are parsed as dates:
- feed.modified is parsed into feed.modified_parsed.
- entries[i].issued is parsed into entries[i].issued_parsed.
- entries[i].created is parsed into entries[i].created_parsed.
- entries[i].modified is parsed into entries[i].modified_parsed.
- entries[i].expired is parsed into entries[i].expired_parsed.
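Because each `*_parsed` value is a plain 9-tuple, it can be passed directly to standard library functions. The following is a minimal sketch, assuming the tuple is expressed in UTC; the helper name `parsed_to_datetime` and the sample value are illustrative and not part of Universal Feed Parser.

```python
import calendar
import time
from datetime import datetime, timezone


def parsed_to_datetime(parsed):
    """Convert a UTC 9-tuple (hypothetical helper) to an aware datetime."""
    # calendar.timegm interprets the tuple as UTC, unlike time.mktime,
    # which would interpret it as local time.
    return datetime.fromtimestamp(calendar.timegm(parsed), tz=timezone.utc)


# Sample value standing in for something like feed.modified_parsed.
modified_parsed = (2004, 1, 1, 19, 48, 21, 3, 1, 0)

print(parsed_to_datetime(modified_parsed).isoformat())
print(time.strftime("%Y-%m-%d", modified_parsed))
```

Using `calendar.timegm` rather than `time.mktime` avoids silently shifting the value by the local timezone offset.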
Here is a brief history of feed date formats:
- CDF states that all date values must conform to ISO 8601:1988. ISO 8601:1988 is not a freely available specification, but a brief (non-normative) description of its date formats is available in ISO 8601:1988 Date/Time Representations.
- RSS 0.90 has no date elements.
- Netscape RSS 0.91 does not specify a date format, but examples within the specification show RFC 822-style dates with 4-digit years.
- Userland RSS 0.91 states, “All date-times in RSS conform to the Date and Time Specification of RFC 822.” RFC 822 mandates 2-digit years; it does not allow 4-digit years.
- RSS 1.0 states that all date elements must conform to W3DTF, which is a profile of ISO 8601:1988.
- RSS 2.0 states, “All date-times in RSS conform to the Date and Time Specification of RFC 822, with the exception that the year may be expressed with two characters or four characters (four preferred).”
- Atom states that all date elements must conform to W3DTF.
Here is a representative list of the formats that Universal Feed Parser can recognize in any date element:
Recognized Date Formats
Description | Example | Parsed Value |
---|---|---|
valid RFC 822 (2-digit year) | Thu, 01 Jan 04 19:48:21 GMT | (2004, 1, 1, 19, 48, 21, 3, 1, 0) |
valid RFC 822 (4-digit year) | Thu, 01 Jan 2004 19:48:21 GMT | (2004, 1, 1, 19, 48, 21, 3, 1, 0) |
valid W3DTF (numeric timezone) | 2003-12-31T10:14:55-08:00 | (2003, 12, 31, 18, 14, 55, 2, 365, 0) |
valid W3DTF (UTC timezone) | 2003-12-31T10:14:55Z | (2003, 12, 31, 10, 14, 55, 2, 365, 0) |
valid W3DTF (yyyy) | 2003 | (2003, 1, 1, 0, 0, 0, 2, 1, 0) |
valid W3DTF (yyyy-mm) | 2003-12 | (2003, 12, 1, 0, 0, 0, 0, 335, 0) |
valid W3DTF (yyyy-mm-dd) | 2003-12-31 | (2003, 12, 31, 0, 0, 0, 2, 365, 0) |
valid ISO 8601 (yyyymmdd) | 20031231 | (2003, 12, 31, 0, 0, 0, 2, 365, 0) |
valid ISO 8601 (-yy-mm) | -03-12 | (2003, 12, 1, 0, 0, 0, 0, 335, 0) |
valid ISO 8601 (-yymm) | -0312 | (2003, 12, 1, 0, 0, 0, 0, 335, 0) |
valid ISO 8601 (yy-mm-dd) | 03-12-31 | (2003, 12, 31, 0, 0, 0, 2, 365, 0) |
valid ISO 8601 (yymmdd) | 031231 | (2003, 12, 31, 0, 0, 0, 2, 365, 0) |
valid ISO 8601 (yyyy-o) | 2003-335 | (2003, 12, 1, 0, 0, 0, 0, 335, 0) |
valid ISO 8601 (yyo) | 03335 | (2003, 12, 1, 0, 0, 0, 0, 335, 0) |
valid asctime | Sun Jan 4 16:29:06 PST 2004 | (2004, 1, 5, 0, 29, 6, 0, 5, 0) |
bogus RFC 822 (invalid day/month) | Thu, 31 Jun 2004 19:48:21 GMT | (2004, 7, 1, 19, 48, 21, 3, 183, 0) |
bogus RFC 822 (invalid month) | Mon, 26 January 2004 16:31:00 EST | (2004, 1, 26, 21, 31, 0, 0, 26, 0) |
bogus RFC 822 (invalid timezone) | Mon, 26 Jan 2004 16:31:00 ET | (2004, 1, 26, 21, 31, 0, 0, 26, 0) |
bogus W3DTF (invalid hour) | 2003-12-31T25:14:55Z | (2004, 1, 1, 1, 14, 55, 3, 1, 0) |
bogus W3DTF (invalid minute) | 2003-12-31T10:61:55Z | (2003, 12, 31, 11, 1, 55, 2, 365, 0) |
bogus W3DTF (invalid second) | 2003-12-31T10:14:61Z | (2003, 12, 31, 10, 15, 1, 2, 365, 0) |
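The "bogus" rows in the table show out-of-range fields rolling over into the next unit. The sketch below reproduces that rollover with the standard library only. It is an illustration of the arithmetic, not Universal Feed Parser's actual implementation, and `normalize_utc` is a hypothetical name. It assumes the month is already in the range 1 to 12.

```python
import calendar
import time


def normalize_utc(year, month, day, hour=0, minute=0, second=0):
    """Roll out-of-range day, hour, minute, or second values forward."""
    # calendar.timegm does plain arithmetic on these fields, so an hour
    # of 25 or a day of 31 in June carries over into the next unit.
    epoch = calendar.timegm((year, month, day, hour, minute, second, 0, 0, 0))
    # time.gmtime recomputes weekday and day-of-year for the result.
    return tuple(time.gmtime(epoch))


print(normalize_utc(2004, 6, 31, 19, 48, 21))   # invalid day/month
print(normalize_utc(2003, 12, 31, 25, 14, 55))  # invalid hour
```

The two calls print `(2004, 7, 1, 19, 48, 21, 3, 183, 0)` and `(2004, 1, 1, 1, 14, 55, 3, 1, 0)`, matching the corresponding rows of the table.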
Universal Feed Parser recognizes all character-based timezone abbreviations defined in RFC 822. In addition, Universal Feed Parser recognizes the following invalid timezones:
- AT is treated as AST
- ET is treated as EST
- CT is treated as CST
- MT is treated as MST
- PT is treated as PST
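The alias handling above can be approximated with the standard library by rewriting the timezone token before parsing. This is a sketch under stated assumptions, not the parser's own code: `TZ_ALIASES` and `parse_rfc822_lenient` are hypothetical names, and the function assumes the timezone is the last whitespace-separated token.

```python
import time
from email.utils import mktime_tz, parsedate_tz

# The invalid abbreviations listed above, mapped to their RFC 822 names.
TZ_ALIASES = {"AT": "AST", "ET": "EST", "CT": "CST", "MT": "MST", "PT": "PST"}


def parse_rfc822_lenient(date_string):
    """Parse an RFC 822-style date into a UTC struct_time, or None."""
    parts = date_string.split()
    # Swap a trailing alias such as "ET" for the name RFC 822 defines.
    if parts and parts[-1].upper() in TZ_ALIASES:
        parts[-1] = TZ_ALIASES[parts[-1].upper()]
    parsed = parsedate_tz(" ".join(parts))
    if parsed is None:
        return None
    # mktime_tz applies the timezone offset and returns a UTC timestamp.
    return time.gmtime(mktime_tz(parsed))


print(tuple(parse_rfc822_lenient("Mon, 26 Jan 2004 16:31:00 ET")))
```

Without the substitution, `parsedate_tz` does not recognize `ET` and reports no timezone offset, so the result would depend on the local timezone.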