This page describes the standard data type modules available in Python 3, and how to use them.

Overview

The modules described on this page provide a variety of specialized data types such as dates and times, fixed-type arrays, heap queues, synchronized queues, and sets.

  • Overview
  • datetime: Basic date and time
  • Calendar: working with calendar dates
  • Collections: container datatypes
  • Bisect: array bisection
  • Array: efficient arrays
  • Enum: enumerations
  • Derived enumerations

Python also provides some built-in data types, in particular, dict, list, set and frozenset, and tuple. The str class is used to hold Unicode strings, and the bytes class is used to hold binary data.

Datetime: Basic date and time types

The datetime module supplies classes for manipulating dates and times in both simple and complex ways. While date and time arithmetic is supported, the focus of the implementation is on efficient attribute extraction for output formatting and manipulation. For related functionality, see also the time and calendar modules.

There are two kinds of date and time objects: “naive” and “aware”.

An aware object has sufficient knowledge of applicable algorithmic and political time adjustments, such as time zone and daylight saving time information, to locate itself relative to other aware objects. An aware object is used to represent a specific moment in time that is not open to interpretation.

A naive object does not contain enough information to unambiguously locate itself relative to other date/time objects. Whether a naive object represents UTC (Coordinated Universal Time), local time, or time in some other timezone is purely up to the program, like it is up to the program whether a particular number represents metres, miles, or mass. Naive objects are easy to understand and to work with, at the cost of ignoring some aspects of reality.

For applications requiring aware objects, datetime and time objects have an optional time zone information attribute, tzinfo, that can be set to an instance of a subclass of the abstract tzinfo class. These tzinfo objects capture information about the offset from UTC time, the time zone name, and whether Daylight Saving Time is in effect. Note that only one concrete tzinfo class, the timezone class, is supplied by the datetime module. The timezone class can represent simple timezones with fixed offset from UTC, such as UTC itself or North American EST and EDT timezones. Supporting timezones at deeper levels of detail is up to the application. The rules for time adjustment across the world are more political than rational, change frequently, and there is no standard suitable for every application aside from UTC.

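The datetime module exports the following constants: MINYEAR, the smallest year number allowed in a date or datetime object (1), and MAXYEAR, the largest (9999).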

datetime Available Types

Objects of these types are immutable.

Objects of the date type are always naive.

An object of type time or datetime may be naive or aware. A datetime object d is aware if d.tzinfo is not None and d.tzinfo.utcoffset(d) does not return None. If d.tzinfo is None, or if d.tzinfo is not None but d.tzinfo.utcoffset(d) returns None, d is naive. A time object t is aware if t.tzinfo is not None and t.tzinfo.utcoffset(None) does not return None. Otherwise, t is naive.
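
The rule above can be written directly as a pair of predicates. This is only a sketch: the helper names is_aware and is_aware_time are made up for illustration and are not part of the datetime module.

```python
from datetime import datetime, time, timedelta, timezone


def is_aware(d):
    """Datetime rule: tzinfo is set and utcoffset(d) returns an offset."""
    return d.tzinfo is not None and d.tzinfo.utcoffset(d) is not None


def is_aware_time(t):
    """Time rule: same check, but utcoffset() is called with None."""
    return t.tzinfo is not None and t.tzinfo.utcoffset(None) is not None


naive_dt = datetime(2020, 1, 1, 12, 0)
aware_dt = datetime(2020, 1, 1, 12, 0, tzinfo=timezone.utc)
naive_t = time(12, 0)
aware_t = time(12, 0, tzinfo=timezone(timedelta(hours=-5)))

print(is_aware(naive_dt), is_aware(aware_dt))            # False True
print(is_aware_time(naive_t), is_aware_time(aware_t))    # False True
```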

The distinction between naive and aware doesn’t apply to timedelta objects.

Subclass relationships:

object
    timedelta
    tzinfo
        timezone
    time
    date
        datetime

datetime.timedelta Objects

A timedelta object represents a duration, the difference between two dates or times.

Class attributes are:

  • A millisecond is converted to 1000 microseconds.
  • A minute is converted to 60 seconds.
  • An hour is converted to 3600 seconds.
  • A week is converted to seven days.

  • 0 <= microseconds < 1000000
  • 0 <= seconds < 3600*24 (the number of seconds in one day)
  • -999999999 <= days <= 999999999

>>> from datetime import timedelta
>>> d = timedelta(microseconds=-1)
>>> (d.days, d.seconds, d.microseconds)
(-1, 86399, 999999)

Note that, because of normalization, timedelta.max > -timedelta.min. -timedelta.max is not representable as a timedelta object.
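
As a small illustration of normalization and of the asymmetry between timedelta.max and timedelta.min, the following sketch builds a duration from mixed units and then tries to negate the largest representable value. The variable names are arbitrary.

```python
from datetime import timedelta

# Mixed units collapse into days, seconds and microseconds.
d = timedelta(weeks=1, hours=1, minutes=1, milliseconds=1)
parts = (d.days, d.seconds, d.microseconds)
print(parts)  # (7, 3660, 1000)

# timedelta.min has no seconds or microseconds part, so its negation fits.
max_is_larger = timedelta.max > -timedelta.min
print(max_is_larger)  # True

# Negating timedelta.max would need one more day than the range allows.
try:
    -timedelta.max
    negation_overflowed = False
except OverflowError:
    negation_overflowed = True
print(negation_overflowed)  # True
```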

Instance attributes (read-only):

Supported operations:

Notes:

  • This is exact, but may overflow.
  • This is exact, and cannot overflow.
  • Division by 0 raises ZeroDivisionError.
  • -timedelta.max is not representable as a timedelta object.
  • String representations of timedelta objects are normalized similarly to their internal representation. This leads to somewhat unusual results for negative timedeltas. For example:

>>> timedelta(hours=-5)
datetime.timedelta(-1, 68400)
>>> print(_)
-1 day, 19:00:00

In addition to the operations listed above timedelta objects support certain additions and subtractions with date and datetime objects (see below).

Changed in version 3.2: Floor division and true division of a timedelta object by another timedelta object are now supported, as are remainder operations and the divmod() function. True division and multiplication of a timedelta object by a float object are now supported.
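
The operations named in this note can be tried on two arbitrary durations. The values below are chosen only so that the results are easy to check by hand.

```python
from datetime import timedelta

span = timedelta(hours=5)       # 300 minutes
step = timedelta(minutes=90)

ratio = span / step             # true division of two timedeltas gives a float
whole = span // step            # floor division gives an int
left_over = span % step         # remainder is a timedelta
q, r = divmod(span, step)       # both at once

scaled = step * 1.5             # multiplication by a float
halved = span / 2.5             # true division by a float

print(whole, left_over, scaled, halved)
```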

Comparisons of timedelta objects are supported with the timedelta object representing the smaller duration considered to be the smaller timedelta. To stop mixed-type comparisons from falling back to the default comparison by object address, when a timedelta object is compared to an object of a different type, TypeError is raised unless the comparison is == or !=. The latter cases return False or True, respectively.

timedelta objects are hashable (usable as dictionary keys), support efficient pickling, and in Boolean contexts, a timedelta object is considered to be true if and only if it isn’t equal to timedelta(0).
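
A short sketch of the comparison, hashing and truth-value rules described above. The dictionary labels are illustrative only.

```python
from datetime import timedelta

short = timedelta(minutes=5)
long = timedelta(hours=1)

ordered = short < long              # durations compare by length
equal_to_int = (short == 300)       # mixed-type equality is simply False

# Mixed-type ordering raises instead of comparing object addresses.
try:
    short < 300
    ordering_raised = False
except TypeError:
    ordering_raised = True

# Hashable, so usable as dictionary keys.
labels = {timedelta(0): "none", short: "brief"}

# Only the zero duration is false in a Boolean context.
truthiness = (bool(timedelta(0)), bool(short))
print(ordered, equal_to_int, ordering_raised, truthiness)
```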

Instance methods:

>>> from datetime import timedelta
>>> year = timedelta(days=365)
>>> another_year = timedelta(weeks=40, days=84, hours=23,
...                          minutes=50, seconds=600)  # adds up to 365 days
>>> year.total_seconds()
31536000.0
>>> year == another_year
True
>>> ten_years = 10 * year
>>> ten_years, ten_years.days // 365
(datetime.timedelta(3650), 10)
>>> nine_years = ten_years - year
>>> nine_years, nine_years.days // 365
(datetime.timedelta(3285), 9)
>>> three_years = nine_years // 3;
>>> three_years, three_years.days // 365
(datetime.timedelta(1095), 3)
>>> abs(three_years - ten_years) == 2 * three_years + year
True

datetime.date Objects

A date object represents a date (year, month and day) in an idealized calendar, the current Gregorian calendar indefinitely extended in both directions. January 1 of year 1 is called day number 1, January 2 of year 1 is called day number 2, and so on. This matches the definition of the “proleptic Gregorian” calendar in Dershowitz and Reingold’s book “Calendrical Calculations”, where it’s the base calendar for all computations. See the book for algorithms for converting between proleptic Gregorian ordinals and other calendar systems.
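
The day numbering can be checked with toordinal() and fromordinal(). This is a minimal sketch; the date 2002-03-11 is taken from the example later in this section.

```python
from datetime import date

ordinal_of_first = date(1, 1, 1).toordinal()   # January 1 of year 1 is day 1
second_day = date.fromordinal(2)               # day 2 is January 2 of year 1

d = date(2002, 3, 11)
ordinal = d.toordinal()
round_trip = date.fromordinal(ordinal)

print(ordinal_of_first, second_day, ordinal, round_trip)
```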

Other constructors, all class methods:

  • MINYEAR <= year <= MAXYEAR
  • 1 <= month <= 12
  • 1 <= day <= number of days in the given month and year

Class attributes:

  • date2 is moved forward in time if timedelta.days > 0, or backward if timedelta.days < 0. Afterward date2 - date1 == timedelta.days. timedelta.seconds and timedelta.microseconds are ignored. OverflowError is raised if date2.year would be smaller than MINYEAR or larger than MAXYEAR.
  • This isn’t quite equivalent to date1 + (-timedelta), because -timedelta in isolation can overflow in cases where date1 - timedelta does not. timedelta.seconds and timedelta.microseconds are ignored.
  • This is exact, and cannot overflow. timedelta.seconds and timedelta.microseconds are 0, and date2 + timedelta == date1 after.
  • In other words, date1 < date2 if and only if date1.toordinal() < date2.toordinal(). To stop comparison from falling back to the default scheme of comparing object addresses, date comparison normally raises TypeError if the other comparand isn’t also a date object. However, NotImplemented is returned instead if the other comparand has a timetuple() attribute. This hook gives other kinds of date objects a chance at implementing mixed-type comparison. If not, when a date object is compared to an object of a different type, TypeError is raised unless the comparison is == or !=. The latter cases return False or True, respectively.
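
The arithmetic and comparison rules in the list above can be sketched as follows. The dates are arbitrary; 2020 is used because it is a leap year.

```python
from datetime import date, timedelta

date1 = date(2020, 2, 28)
delta = timedelta(days=2, hours=23, minutes=59)

date2 = date1 + delta       # only delta.days is used; hours and minutes are ignored
diff = date2 - date1        # difference of two dates is a whole number of days
earlier = date1 - delta     # again only delta.days is used

# Leaving the supported year range raises OverflowError.
try:
    date.max + timedelta(days=1)
    overflowed = False
except OverflowError:
    overflowed = True

# Ordering against a non-date raises TypeError; equality is just False.
try:
    date1 < "2020-03-01"
    ordering_raised = False
except TypeError:
    ordering_raised = True
equal_to_str = (date1 == "2020-02-28")

print(date2, diff, earlier, overflowed, ordering_raised, equal_to_str)
```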

Dates can be used as dictionary keys. In Boolean contexts, all date objects are considered to be true.

Example of counting days to an event:

>>> import time
>>> from datetime import date
>>> today = date.today()
>>> today
datetime.date(2007, 12, 5)
>>> today == date.fromtimestamp(time.time())
True
>>> my_birthday = date(today.year, 6, 24)
>>> if my_birthday < today:
...     my_birthday = my_birthday.replace(year=today.year + 1)
>>> my_birthday
datetime.date(2008, 6, 24)
>>> time_to_birthday = abs(my_birthday - today)
>>> time_to_birthday.days
202

Example of working with date:

>>> from datetime import date
>>> d = date.fromordinal(730920) # 730920th day after 1. 1. 0001
>>> d
datetime.date(2002, 3, 11)
>>> t = d.timetuple()
>>> for i in t:
...     print(i)
2002                # year
3                   # month
11                  # day
0
0
0
0                   # weekday (0 = Monday)
70                  # 70th day in the year
-1
>>> ic = d.isocalendar()
>>> for i in ic:
...     print(i)
2002                # ISO year
11                  # ISO week number
1                   # ISO day number ( 1 = Monday )
>>> d.isoformat()
'2002-03-11'
>>> d.strftime("%d/%m/%y")
'11/03/02'
>>> d.strftime("%A %d. %B %Y")
'Monday 11. March 2002'
>>> 'The {1} is {0:%d}, the {2} is {0:%B}.'.format(d, "day", "month")
'The day is 11, the month is March.'

datetime.datetime Objects

A datetime object is a single object containing all the information from a date object and a time object. Like a date object, datetime assumes the current Gregorian calendar extended in both directions; like a time object, datetime assumes there are exactly 3600*24 seconds in every day.

Constructor:

  • MINYEAR <= year <= MAXYEAR
  • 1 <= month <= 12
  • 1 <= day <= number of days in the given month and year
  • 0 <= hour < 24
  • 0 <= minute < 60
  • 0 <= second < 60
  • 0 <= microsecond < 1000000

datetime(1970, 1, 1) + timedelta(seconds=timestamp)

  • datetime2 is a duration of timedelta removed from datetime1, moving forward in time if timedelta.days > 0, or backward if timedelta.days < 0. The result has the same tzinfo attribute as the input datetime, and datetime2 - datetime1 == timedelta after. OverflowError is raised if datetime2.year would be smaller than MINYEAR or larger than MAXYEAR. Note that no time zone adjustments are done even if the input is an aware object.
  • Computes the datetime2 such that datetime2 + timedelta == datetime1. As for addition, the result has the same tzinfo attribute as the input datetime, and no time zone adjustments are done even if the input is aware. This isn’t quite equivalent to datetime1 + (-timedelta), because -timedelta in isolation can overflow in cases where datetime1 - timedelta does not.
  • Subtraction of a datetime from a datetime is defined only if both operands are naive, or if both are aware. If one is aware and the other is naive, TypeError is raised. If both are naive, or both are aware and have the same tzinfo attribute, the tzinfo attributes are ignored, and the result is a timedelta object t such that datetime2 + t == datetime1. No time zone adjustments are done in this case. If both are aware and have different tzinfo attributes, a-b acts as if a and b were first converted to naive UTC datetimes. The result is (a.replace(tzinfo=None) - a.utcoffset()) - (b.replace(tzinfo=None) - b.utcoffset()) except that the implementation never overflows.
  • datetime1 is considered less than datetime2 when datetime1 precedes datetime2 in time. If one comparand is naive and the other is aware, TypeError is raised if an order comparison is attempted. For equality comparisons, naive instances are never equal to aware instances. If both comparands are aware, and have the same tzinfo attribute, the common tzinfo attribute is ignored and the base datetimes are compared. If both comparands are aware and have different tzinfo attributes, the comparands are first adjusted by subtracting their UTC offsets (obtained from self.utcoffset()). Note: To stop comparison from falling back to the default scheme of comparing object addresses, datetime comparison normally raises TypeError if the other comparand isn’t also a datetime object. However, NotImplemented is returned instead if the other comparand has a timetuple() attribute. This hook gives other kinds of date objects a chance at implementing mixed-type comparison. If not, when a datetime object is compared to an object of a different type, TypeError is raised unless the comparison is == or !=. The latter cases return False or True, respectively.
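
The rules in the list above for mixing naive and aware datetimes can be sketched with the built-in timezone class. The "EST" label and the chosen instants are illustrative assumptions.

```python
from datetime import datetime, timedelta, timezone

est = timezone(timedelta(hours=-5), "EST")
utc_noon = datetime(2020, 1, 1, 12, 0, tzinfo=timezone.utc)
est_7am = datetime(2020, 1, 1, 7, 0, tzinfo=est)
naive_noon = datetime(2020, 1, 1, 12, 0)

# Aware operands with different tzinfo are compared after removing UTC offsets.
same_instant = (utc_noon == est_7am)
gap = utc_noon - est_7am

# Adding a timedelta keeps the tzinfo and applies no zone adjustment.
shifted = est_7am + timedelta(hours=24)

# Naive and aware never compare equal; ordering and subtraction raise.
naive_equals_aware = (naive_noon == utc_noon)
try:
    naive_noon < utc_noon
    ordering_raised = False
except TypeError:
    ordering_raised = True
try:
    naive_noon - utc_noon
    subtraction_raised = False
except TypeError:
    subtraction_raised = True

print(same_instant, gap, shifted, naive_equals_aware)
```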

datetime objects can be used as dictionary keys. In Boolean contexts, all datetime objects are considered to be true.

Examples of working with datetime objects:

def astimezone(self, tz):
    if self.tzinfo is tz:
        return self
    # Convert self to UTC, and attach the new time zone object.
    utc = (self - self.utcoffset()).replace(tzinfo=tz)
    # Convert from UTC to tz's local time.
    return tz.fromutc(utc)

(dt - datetime(1970, 1, 1, tzinfo=timezone.utc)).total_seconds()
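
For an aware datetime, the expression above can be compared with the timestamp() method. The sketch below uses midnight on 1 January 2000 UTC as an arbitrary test instant.

```python
from datetime import datetime, timezone

dt = datetime(2000, 1, 1, tzinfo=timezone.utc)
epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)

# Seconds since the epoch, computed by subtraction.
seconds = (dt - epoch).total_seconds()
print(seconds)  # 946684800.0

# The same value from the built-in method, and the reverse conversion.
from_method = dt.timestamp()
back_again = datetime.fromtimestamp(seconds, tz=timezone.utc)
```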

>>> from datetime import tzinfo, timedelta, datetime
>>> class TZ(tzinfo):
...     def utcoffset(self, dt): return timedelta(minutes=-399)
...
>>> datetime(2002, 12, 25, tzinfo=TZ()).isoformat(' ')
'2002-12-25 00:00:00-06:39'

>>> from datetime import datetime, date, time
>>> # Using datetime.combine()
>>> d = date(2005, 7, 14)
>>> t = time(12, 30)
>>> datetime.combine(d, t)
datetime.datetime(2005, 7, 14, 12, 30)
>>> # Using datetime.now() or datetime.utcnow()
>>> datetime.now()
datetime.datetime(2007, 12, 6, 16, 29, 43, 79043)   # GMT +1
>>> datetime.utcnow()
datetime.datetime(2007, 12, 6, 15, 29, 43, 79060)
>>> # Using datetime.strptime()
>>> dt = datetime.strptime("21/11/06 16:30", "%d/%m/%y %H:%M")
>>> dt
datetime.datetime(2006, 11, 21, 16, 30)
>>> # Using datetime.timetuple() to get tuple of all attributes
>>> tt = dt.timetuple()
>>> for it in tt:
...     print(it)
...
2006    # year
11      # month
21      # day
16      # hour
30      # minute
0       # second
1       # weekday (0 = Monday)
325     # number of days since 1st January
-1      # dst - method tzinfo.dst() returned None
>>> # Date in ISO format
>>> ic = dt.isocalendar()
>>> for it in ic:
...     print(it)
...
2006    # ISO year
47      # ISO week
2       # ISO weekday
>>> # Formatting datetime
>>> dt.strftime("%A, %d. %B %Y %I:%M%p")
'Tuesday, 21. November 2006 04:30PM'
>>> 'The {1} is {0:%d}, the {2} is {0:%B}, the {3} is {0:%I:%M%p}.'.format(dt, "day", "month", "time")
'The day is 21, the month is November, the time is 04:30PM.'

Using datetime with tzinfo:

>>> from datetime import timedelta, datetime, tzinfo
>>> class GMT1(tzinfo):
...     def utcoffset(self, dt):
...         return timedelta(hours=1) + self.dst(dt)
...     def dst(self, dt):
...         # DST starts last Sunday in March
...         d = datetime(dt.year, 4, 1)   # ends last Sunday in October
...         self.dston = d - timedelta(days=d.weekday() + 1)
...         d = datetime(dt.year, 11, 1)
...         self.dstoff = d - timedelta(days=d.weekday() + 1)
...         if self.dston <= dt.replace(tzinfo=None) < self.dstoff:
...             return timedelta(hours=1)
...         else:
...             return timedelta(0)
...     def tzname(self,dt):
...         return "GMT +1"
...
>>> class GMT2(tzinfo):
...     def utcoffset(self, dt):
...         return timedelta(hours=2) + self.dst(dt)
...     def dst(self, dt):
...         d = datetime(dt.year, 4, 1)
...         self.dston = d - timedelta(days=d.weekday() + 1)
...         d = datetime(dt.year, 11, 1)
...         self.dstoff = d - timedelta(days=d.weekday() + 1)
...         if self.dston <= dt.replace(tzinfo=None) < self.dstoff:
...             return timedelta(hours=1)
...         else:
...             return timedelta(0)
...     def tzname(self,dt):
...         return "GMT +2"
...
>>> gmt1 = GMT1()

>>> # Daylight Saving Time
>>> dt1 = datetime(2006, 11, 21, 16, 30, tzinfo=gmt1)
>>> dt1.dst()
datetime.timedelta(0)
>>> dt1.utcoffset()
datetime.timedelta(0, 3600)
>>> dt2 = datetime(2006, 6, 14, 13, 0, tzinfo=gmt1)
>>> dt2.dst()
datetime.timedelta(0, 3600)
>>> dt2.utcoffset()
datetime.timedelta(0, 7200)
>>> # Convert datetime to another time zone
>>> dt3 = dt2.astimezone(GMT2())
>>> dt3
datetime.datetime(2006, 6, 14, 14, 0, tzinfo=<GMT2 object at 0x…>)
>>> dt2
datetime.datetime(2006, 6, 14, 13, 0, tzinfo=<GMT1 object at 0x…>)
>>> dt2.utctimetuple() == dt3.utctimetuple()
True

datetime.time Objects

A time object represents a (local) time of day, independent of any particular day, and subject to adjustment via a tzinfo object.

  • 0 <= hour < 24
  • 0 <= minute < 60
  • 0 <= second < 60
  • 0 <= microsecond < 1000000

  • comparison of time to time, where a is considered less than b when a precedes b in time. If one comparand is naive and the other is aware, TypeError is raised if an order comparison is attempted. For equality comparisons, naive instances are never equal to aware instances. If both comparands are aware, and have the same tzinfo attribute, the common tzinfo attribute is ignored and the base times are compared. If both comparands are aware and have different tzinfo attributes, the comparands are first adjusted by subtracting their UTC offsets (obtained from self.utcoffset()). To stop mixed-type comparisons from falling back to the default comparison by object address, when a time object is compared to an object of a different type, TypeError is raised unless the comparison is == or !=. The latter cases return False or True, respectively.

  • hash, use as dict key

  • efficient “pickling” (object serialization)

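  • in Boolean contexts, a time object is considered to be true if and only if, after converting it to minutes and subtracting utcoffset() (or 0 if that’s None), the result is non-zero. This describes Python versions before 3.5; from 3.5 onward, a time object is always considered to be true.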

Example:

>>> from datetime import time, tzinfo, timedelta
>>> class GMT1(tzinfo):
...     def utcoffset(self, dt):
...         return timedelta(hours=1)
...     def dst(self, dt):
...         return timedelta(0)
...     def tzname(self,dt):
...         return "Europe/Prague"
...
>>> t = time(12, 10, 30, tzinfo=GMT1())
>>> t
datetime.time(12, 10, 30, tzinfo=<GMT1 object at 0x…>)
>>> gmt = GMT1()
>>> t.isoformat()
'12:10:30+01:00'
>>> t.dst()
datetime.timedelta(0)
>>> t.tzname()
'Europe/Prague'
>>> t.strftime("%H:%M:%S %Z")
'12:10:30 Europe/Prague'
>>> 'The {} is {:%H:%M}.'.format("time", t)
'The time is 12:10.'

datetime.tzinfo Objects

tzinfo is an abstract base class, meaning that this class should not be instantiated directly. You need to derive a concrete subclass, and (at least) supply implementations of the standard tzinfo methods needed by the datetime methods you use. The datetime module supplies one simple concrete subclass of tzinfo, timezone, which can represent timezones with a fixed offset from UTC, such as UTC itself or North American EST and EDT.

An instance of (a concrete subclass of) tzinfo can be passed to the constructors for datetime and time objects. The latter objects view their attributes as being in local time, and the tzinfo object supports methods revealing offset of local time from UTC, the name of the time zone, and DST offset, all relative to a date or time object passed to them.

Special requirement for pickling: A tzinfo subclass must have an __init__() method that can be called with no arguments, else it can be pickled but possibly not unpickled again. This is a technical requirement that may be relaxed in the future.

A concrete subclass of tzinfo may need to implement the following methods. Exactly which methods are needed depends on the uses made of aware datetime objects. If in doubt, implement all of them.

These methods are called by a datetime or time object, in response to their methods of the same names. A datetime object passes itself as the argument, and a time object passes None as the argument. A tzinfo subclass’s methods should therefore be prepared to accept a dt argument of None, or of class datetime.

If utcoffset() does not return None, dst() should not return None either. The default implementation of utcoffset() raises NotImplementedError.

An instance tz of a tzinfo subclass that models both standard and daylight times must be consistent in this sense:

tz.utcoffset(dt) - tz.dst(dt)

must return the same result for every datetime dt with dt.tzinfo == tz. For sane tzinfo subclasses, this expression yields the time zone’s “standard offset”, which should not depend on the date or the time, but only on geographic location. The implementation of datetime.astimezone() relies on this, but cannot detect violations; it’s the programmer’s responsibility to ensure it. If a tzinfo subclass cannot guarantee this, it may be able to override the default implementation of tzinfo.fromutc() to work correctly with astimezone() regardless.
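
One way to spot-check this consistency requirement is to evaluate the expression on sample dates and confirm that only one value appears. The sketch below uses a hypothetical ToyZone class (UTC+1, with an invented DST period from 1 April to 1 October); it is not a real time zone.

```python
from datetime import datetime, timedelta, tzinfo


class ToyZone(tzinfo):
    """Hypothetical zone: standard offset UTC+1, DST from 1 April to 1 October."""

    def dst(self, dt):
        if dt is None:
            return timedelta(0)
        naive = dt.replace(tzinfo=None)
        if datetime(dt.year, 4, 1) <= naive < datetime(dt.year, 10, 1):
            return timedelta(hours=1)
        return timedelta(0)

    def utcoffset(self, dt):
        return timedelta(hours=1) + self.dst(dt)

    def tzname(self, dt):
        return "TOY"


tz = ToyZone()
samples = [datetime(2020, month, 15, tzinfo=tz) for month in range(1, 13)]

# A consistent class yields exactly one standard offset across all samples.
standard_offsets = {tz.utcoffset(dt) - tz.dst(dt) for dt in samples}
print(standard_offsets)
```

A check like this only samples a few dates, so it can reveal an inconsistency but cannot prove that none exists.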

Most implementations of dst() will probably look like one of these two:

def dst(self, dt):
    # a fixed-offset class: doesn't account for DST
    return timedelta(0)

or

def dst(self, dt):
    # Code to set dston and dstoff to the time zone's DST
    # transition times based on the input dt.year, and expressed
    # in standard local time. Then

    if dston <= dt.replace(tzinfo=None) < dstoff:
        return timedelta(hours=1)
    else:
        return timedelta(0)

The default implementation of dst() raises NotImplementedError.

The default implementation of tzname() raises NotImplementedError.

When None is passed, it’s up to the class designer to decide the best response. For example, returning None is appropriate if the class wants to say that time objects don’t participate in the tzinfo protocols. It may be more useful for utcoffset(None) to return the standard UTC offset, as there is no other convention for discovering the standard offset.

When a datetime object is passed in response to a datetime method, dt.tzinfo is the same object as self. tzinfo methods can rely on this, unless user code calls tzinfo methods directly. The intent is that the tzinfo methods interpret dt as being in local time, and not need worry about objects in other timezones.

There is one more tzinfo method that a subclass may want to override:

def fromutc(self, dt):
    # raise ValueError error if dt.tzinfo is not self
    dtoff = dt.utcoffset()
    dtdst = dt.dst()
    # raise ValueError if dtoff is None or dtdst is None
    delta = dtoff - dtdst  # this is self's standard offset
    if delta:
        dt += delta   # convert to standard local time
        dtdst = dt.dst()
        # raise ValueError if dtdst is None
    if dtdst:
        return dt + dtdst
    else:
        return dt

Example tzinfo classes:

from datetime import tzinfo, timedelta, datetime

ZERO = timedelta(0)
HOUR = timedelta(hours=1)

A UTC class.

class UTC(tzinfo):
    """UTC"""

    def utcoffset(self, dt):
        return ZERO

    def tzname(self, dt):
        return "UTC"

    def dst(self, dt):
        return ZERO

utc = UTC()

A class building tzinfo objects for fixed-offset time zones. Note that FixedOffset(0, "UTC") is a different way to build a UTC tzinfo object.

class FixedOffset(tzinfo):
    """Fixed offset in minutes east from UTC."""

    def __init__(self, offset, name):
        self.__offset = timedelta(minutes=offset)
        self.__name = name

    def utcoffset(self, dt):
        return self.__offset

    def tzname(self, dt):
        return self.__name

    def dst(self, dt):
        return ZERO

A class capturing the platform’s idea of local time.

import time as _time

STDOFFSET = timedelta(seconds = -_time.timezone)
if _time.daylight:
    DSTOFFSET = timedelta(seconds = -_time.altzone)
else:
    DSTOFFSET = STDOFFSET

DSTDIFF = DSTOFFSET - STDOFFSET

class LocalTimezone(tzinfo):

    def utcoffset(self, dt):
        if self._isdst(dt):
            return DSTOFFSET
        else:
            return STDOFFSET

    def dst(self, dt):
        if self._isdst(dt):
            return DSTDIFF
        else:
            return ZERO

    def tzname(self, dt):
        return _time.tzname[self._isdst(dt)]

    def _isdst(self, dt):
        tt = (dt.year, dt.month, dt.day,
              dt.hour, dt.minute, dt.second,
              dt.weekday(), 0, 0)
        stamp = _time.mktime(tt)
        tt = _time.localtime(stamp)
        return tt.tm_isdst > 0

Local = LocalTimezone()

A complete implementation of current DST rules for major US time zones.

def first_sunday_on_or_after(dt):
    days_to_go = 6 - dt.weekday()
    if days_to_go:
        dt += timedelta(days_to_go)
    return dt

US DST Rules

This is a simplified (i.e., wrong for a few cases) set of rules for US DST start and end times. For a complete and up-to-date set of DST rules and timezone definitions, visit the Olson Database (or try pytz):

http://www.twinsun.com/tz/tz-link.htm
http://sourceforge.net/projects/pytz/ (might not be up-to-date)

In the US since 2007, DST starts at 2am (standard time) on the second Sunday in March, which is the first Sunday on or after Mar 8.

DSTSTART_2007 = datetime(1, 3, 8, 2)

and ends at 2am (DST time; 1am standard time) on the first Sunday of Nov.

DSTEND_2007 = datetime(1, 11, 1, 1)

From 1987 to 2006, DST used to start at 2am (standard time) on the first Sunday in April and to end at 2am (DST time; 1am standard time) on the last Sunday of October, which is the first Sunday on or after Oct 25.

DSTSTART_1987_2006 = datetime(1, 4, 1, 2)
DSTEND_1987_2006 = datetime(1, 10, 25, 1)

From 1967 to 1986, DST used to start at 2am (standard time) on the last Sunday in April (the one on or after April 24) and to end at 2am (DST time; 1am standard time) on the last Sunday of October, which is the first Sunday on or after Oct 25.

DSTSTART_1967_1986 = datetime(1, 4, 24, 2)
DSTEND_1967_1986 = DSTEND_1987_2006

class USTimeZone(tzinfo):

    def __init__(self, hours, reprname, stdname, dstname):
        self.stdoffset = timedelta(hours=hours)
        self.reprname = reprname
        self.stdname = stdname
        self.dstname = dstname

    def __repr__(self):
        return self.reprname

    def tzname(self, dt):
        if self.dst(dt):
            return self.dstname
        else:
            return self.stdname

    def utcoffset(self, dt):
        return self.stdoffset + self.dst(dt)

    def dst(self, dt):
        if dt is None or dt.tzinfo is None:
            # An exception may be sensible here, in one or both cases.
            # It depends on how you want to treat them. The default
            # fromutc() implementation (called by the default astimezone()
            # implementation) passes a datetime with dt.tzinfo is self.
            return ZERO
        assert dt.tzinfo is self

        # Find start and end times for US DST. For years before 1967, return
        # ZERO for no DST.
        if 2006 < dt.year:
            dststart, dstend = DSTSTART_2007, DSTEND_2007
        elif 1986 < dt.year < 2007:
            dststart, dstend = DSTSTART_1987_2006, DSTEND_1987_2006
        elif 1966 < dt.year < 1987:
            dststart, dstend = DSTSTART_1967_1986, DSTEND_1967_1986
        else:
            return ZERO

        start = first_sunday_on_or_after(dststart.replace(year=dt.year))
        end = first_sunday_on_or_after(dstend.replace(year=dt.year))

        # Can't compare naive to aware objects, so strip the timezone from
        # dt first.
        if start <= dt.replace(tzinfo=None) < end:
            return HOUR
        else:
            return ZERO

Eastern  = USTimeZone(-5, "Eastern",  "EST", "EDT")
Central  = USTimeZone(-6, "Central",  "CST", "CDT")
Mountain = USTimeZone(-7, "Mountain", "MST", "MDT")
Pacific  = USTimeZone(-8, "Pacific",  "PST", "PDT")

Note that there are unavoidable subtleties twice per year in a tzinfo subclass accounting for both standard and daylight time, at the DST transition points. For concreteness, consider US Eastern (UTC -0500), where EDT begins the minute after 1:59 (EST) on the second Sunday in March, and ends the minute after 1:59 (EDT) on the first Sunday in November:

    UTC   3:MM  4:MM  5:MM  6:MM  7:MM  8:MM
    EST  22:MM 23:MM  0:MM  1:MM  2:MM  3:MM
    EDT  23:MM  0:MM  1:MM  2:MM  3:MM  4:MM

  start  22:MM 23:MM  0:MM  1:MM  3:MM  4:MM

    end  23:MM  0:MM  1:MM  1:MM  2:MM  3:MM

When DST starts (the “start” line), the local wall clock leaps from 1:59 to 3:00. A wall time of the form 2:MM doesn’t really make sense on that day, so astimezone(Eastern) won’t deliver a result with hour == 2 on the day DST begins. For astimezone() to make this guarantee, the tzinfo.dst() method must consider times in the “missing hour” (2:MM for Eastern) to be in daylight time.

When DST ends (the “end” line), there’s a potentially worse problem: there’s an hour that can’t be spelled unambiguously in local wall time: the last hour of daylight time. In Eastern, that’s times of the form 5:MM UTC on the day daylight time ends. The local wall clock leaps from 1:59 (daylight time) back to 1:00 (standard time) again. Local times of the form 1:MM are ambiguous. astimezone() mimics the local clock’s behavior by mapping two adjacent UTC hours into the same local hour then. In the Eastern example, UTC times of the form 5:MM and 6:MM both map to 1:MM when converted to Eastern. For astimezone() to make this guarantee, the tzinfo.dst() method must consider times in the “repeated hour” to be in standard time. This is easily arranged, as in the example, by expressing DST switch times in the time zone’s standard local time.

Applications that can’t bear such ambiguities should avoid using hybrid tzinfo subclasses; there are no ambiguities when using timezone, or any other fixed-offset tzinfo subclass (such as a class representing only EST (fixed offset -5 hours), or only EDT (fixed offset -4 hours)).

datetime.timezone Objects

The timezone class is a subclass of tzinfo, each instance of which represents a timezone defined by a fixed offset from UTC. Note that objects of this class cannot be used to represent timezone information in the locations where different offsets are used in different days of the year or where historical changes have been made to civil time.
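
A brief sketch of a fixed-offset timezone in use. The offset of -5 hours and the name "EST" are example values; because the offset never changes, the result below stays at -05:00 even for a July date, when a real US Eastern clock would show EDT.

```python
from datetime import datetime, timedelta, timezone

est = timezone(timedelta(hours=-5), "EST")

noon_utc = datetime(2021, 7, 1, 12, 0, tzinfo=timezone.utc)
local = noon_utc.astimezone(est)

print(local.isoformat())   # 2021-07-01T07:00:00-05:00
print(local.utcoffset())   # -1 day, 19:00:00
print(local.tzname())      # EST
print(local.dst())         # None: a fixed offset carries no DST information
```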

strftime() and strptime() Behavior

date, datetime, and time objects all support a strftime(format) method, to create a string representing the time under the control of an explicit format string. Broadly speaking, d.strftime(fmt) acts like the time module’s time.strftime(fmt, d.timetuple()) although not all objects support a timetuple() method.

Conversely, the datetime.strptime() class method creates a datetime object from a string representing a date and time and a corresponding format string. datetime.strptime(date_string, format) is equivalent to datetime(*(time.strptime(date_string, format)[0:6])).
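
The stated equivalence can be checked directly. The input string and format below are borrowed from the strptime() example earlier in this section.

```python
import time
from datetime import datetime

text, fmt = "21/11/06 16:30", "%d/%m/%y %H:%M"

parsed = datetime.strptime(text, fmt)

# First six fields of the struct_time: year, month, day, hour, minute, second.
via_time_module = datetime(*(time.strptime(text, fmt)[0:6]))

print(parsed)                     # 2006-11-21 16:30:00
print(parsed == via_time_module)  # True
```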

For time objects, the format codes for year, month, and day should not be used, as time objects have no such values. If they are used, 1900 is substituted for the year, and 1 for the month and day.

For date objects, the format codes for hours, minutes, seconds, and microseconds should not be used, as date objects have no such values. If they are used, 0 is substituted for them.
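
The substitution rules for time and date objects can be observed with a short sketch. The values are arbitrary.

```python
from datetime import date, time

t = time(12, 30)
d = date(2020, 1, 2)

# Date codes on a time object fall back to 1900-01-01.
time_with_date_codes = t.strftime("%Y-%m-%d %H:%M")
print(time_with_date_codes)   # 1900-01-01 12:30

# Time codes on a date object fall back to zero.
date_with_time_codes = d.strftime("%Y-%m-%d %H:%M:%S")
print(date_with_time_codes)   # 2020-01-02 00:00:00
```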

The full set of format codes supported varies across platforms, because Python calls the platform C library’s strftime() function, and platform variations are common. To see the full set of format codes supported on your platform, consult the strftime documentation.

The following is a list of all the format codes that the C standard (1989 version) requires, and these work on all platforms with a standard C implementation. Note that the 1999 version of the C standard added additional format codes.

  • Because the format depends on the current locale, care should be taken when making assumptions about the output value. Field orderings vary (for example, “month/day/year” versus “day/month/year”), and the output may contain Unicode characters encoded using the locale’s default encoding (for example, if the current locale is ja_JP, the default encoding could be any one of eucJP, SJIS, or utf-8; use locale.getlocale() to determine the current locale’s encoding).
  • The strptime() method can parse years in the full [1, 9999] range, but years < 1000 must be zero-filled to 4-digit width.
  • When used with the strptime() method, the %p directive only affects the output hour field if the %I directive is used to parse the hour.
  • Unlike the time module, the datetime module does not support leap seconds.
  • When used with the strptime() method, the %f directive accepts from one to six digits and zero pads on the right. %f is an extension to the set of format characters in the C standard (but implemented separately in datetime objects, and therefore always available).
  • For a naive object, the %z and %Z format codes are replaced by empty strings. For an aware object:
      %z: utcoffset() is transformed into a 5-character string of the form +HHMM or -HHMM, where HH is a 2-digit string giving the number of UTC offset hours, and MM is a 2-digit string giving the number of UTC offset minutes. For example, if utcoffset() returns timedelta(hours=-3, minutes=-30), %z is replaced with the string ‘-0330’.
      %Z: If tzname() returns None, %Z is replaced by an empty string. Otherwise, %Z is replaced by the returned value, which must be a string.
  • When used with the strptime() method, %U and %W are only used in calculations when the day of the week and the year are specified.
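
Several of the notes above can be checked with a short sketch. The zone label "NST" is a made-up name attached to the -03:30 offset for illustration.

```python
from datetime import datetime, timedelta, timezone

offset = timezone(timedelta(hours=-3, minutes=-30), "NST")  # hypothetical label
aware = datetime(2020, 1, 1, 9, 0, tzinfo=offset)
naive = datetime(2020, 1, 1, 9, 0)

z_aware = aware.strftime("%z")      # '-0330'
name_aware = aware.strftime("%Z")   # 'NST'
z_naive = naive.strftime("%z%Z")    # '' for a naive object

# %f accepts one to six digits and zero pads on the right.
micro = datetime.strptime("07.5", "%S.%f").microsecond

# Years below 1000 must be zero-filled to four digits for strptime().
early = datetime.strptime("0042-01-01", "%Y-%m-%d")

print(z_aware, name_aware, repr(z_naive), micro, early.year)
```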

Calendar: working with calendar dates

This module allows you to output calendars like the Unix cal program, and provides additional useful functions related to the calendar. By default, these calendars have Monday as the first day of the week, and Sunday as the last (the European convention). Use setfirstweekday() to set the first day of the week to Sunday (6) or to any other weekday. Parameters that specify dates are given as integers. For related functionality, see also the datetime and time modules.

Most of these functions and classes rely on the datetime module which uses an idealized calendar, the current Gregorian calendar extended in both directions. This matches the definition of the “proleptic Gregorian” calendar.
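
As a rough illustration of the shared calendar model, the sketch below compares calendar and datetime results for the same dates. The dates are arbitrary examples.

```python
import calendar
from datetime import date

# calendar and datetime use the same proleptic Gregorian calendar,
# so their weekday numbering (Monday == 0) agrees for any valid date.
same_weekday = calendar.weekday(2000, 2, 1) == date(2000, 2, 1).weekday()

# monthrange() returns (weekday of the first day, number of days in the month).
first_weekday, days_in_month = calendar.monthrange(2000, 2)

# 2000 is a leap year (divisible by 400); 1900 is not.
leap_flags = (calendar.isleap(2000), calendar.isleap(1900))
```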

Calendar instances have the following methods:

The calendar.TextCalendar Class

TextCalendar instances have the following methods:

The calendar.HTMLCalendar Class

HTMLCalendar instances have the following methods:

Note: The formatweekday() and formatmonthname() methods of these two classes temporarily change the current locale to the given locale. Because the current locale is a process-wide setting, they are not thread-safe.

calendar Functions

For simple text calendars, the calendar module provides the following functions:

The calendar module exports the following data attributes:

    import calendar
    calendar.setfirstweekday(calendar.SUNDAY)

Collections: container datatypes

This module implements specialized container datatypes providing alternatives to Python’s general purpose built-in containers, dict, list, set, and tuple.

collections.ChainMap objects

A ChainMap class is provided for quickly linking many mappings so they can be treated as a single unit. It is often much faster than creating a new dictionary and running multiple update() calls.

The class can simulate nested scopes and is useful in templating.

All of the usual dictionary methods are supported. Also, there is a maps attribute, a method for creating new subcontexts, and a property for accessing all but the first mapping:
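
A minimal sketch of the maps attribute, new_child(), and parents follows. The dictionaries and their keys are made-up sample data.

```python
from collections import ChainMap

defaults = {"color": "red", "user": "guest"}
overrides = {"user": "admin"}
settings = ChainMap(overrides, defaults)

# Lookups search each mapping in order; the first hit wins.
user = settings["user"]      # found in overrides
color = settings["color"]    # falls through to defaults

# Writes only touch the first mapping in the chain.
settings["color"] = "blue"

# new_child() puts a fresh dict at the front; parents skips the first mapping.
child = settings.new_child()
child["user"] = "temp"
parent_user = child.parents["user"]
```

Because the chain stores references rather than copies, the change made through settings is visible in the overrides dictionary itself.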

collections.ChainMap Examples and Recipes

This section shows various approaches to working with chained maps.

Example of simulating Python’s internal lookup chain:

    import builtins
    pylookup = ChainMap(locals(), globals(), vars(builtins))

Example of letting user specified command-line arguments take precedence over environment variables which in turn take precedence over default values:

    import os, argparse
    defaults = {'color': 'red', 'user': 'guest'}
    parser = argparse.ArgumentParser()
    parser.add_argument('-u', '--user')
    parser.add_argument('-c', '--color')
    namespace = parser.parse_args()
    command_line_args = {k: v for k, v in vars(namespace).items() if v}
    combined = ChainMap(command_line_args, os.environ, defaults)
    print(combined['color'])
    print(combined['user'])

Example patterns for using the ChainMap class to simulate nested contexts:

    c = ChainMap()        # Create root context
    d = c.new_child()     # Create nested child context
    e = c.new_child()     # Child of c, independent from d
    e.maps[0]             # Current context dictionary -- like Python's locals()
    e.maps[-1]            # Root context -- like Python's globals()
    e.parents             # Enclosing context chain -- like Python's nonlocals

    d['x']                # Get first key in the chain of contexts
    d['x'] = 1            # Set value in current context
    del d['x']            # Delete from current context
    list(d)               # All nested values
    k in d                # Check all nested values
    len(d)                # Number of nested values
    d.items()             # All nested items
    dict(d)               # Flatten into a regular dictionary

The ChainMap class only makes updates (writes and deletions) to the first mapping in the chain while lookups search the full chain. However, if deep writes and deletions are desired, it is easy to make a subclass that updates keys found deeper in the chain:

    class DeepChainMap(ChainMap):
        'Variant of ChainMap that allows direct updates to inner scopes'

        def __setitem__(self, key, value):
            for mapping in self.maps:
                if key in mapping:
                    mapping[key] = value
                    return
            self.maps[0][key] = value

        def __delitem__(self, key):
            for mapping in self.maps:
                if key in mapping:
                    del mapping[key]
                    return
            raise KeyError(key)

    >>> d = DeepChainMap({'zebra': 'black'}, {'elephant': 'blue'}, {'lion': 'yellow'})
    >>> d['lion'] = 'orange'         # update an existing key two levels down
    >>> d['snake'] = 'red'           # new keys get added to the topmost dict
    >>> del d['elephant']            # remove an existing key one level down
    >>> d                            # display the result
    DeepChainMap({'zebra': 'black', 'snake': 'red'}, {}, {'lion': 'orange'})

collections.Counter objects

A counter tool is provided to support convenient and rapid tallies. For example:

    >>> # Tally occurrences of words in a list
    >>> cnt = Counter()
    >>> for word in ['red', 'blue', 'red', 'green', 'blue', 'blue']:
    ...     cnt[word] += 1
    >>> cnt
    Counter({'blue': 3, 'red': 2, 'green': 1})

    >>> # Find the ten most common words in Hamlet
    >>> import re
    >>> words = re.findall(r'\w+', open('hamlet.txt').read().lower())
    >>> Counter(words).most_common(10)
    [('the', 1143), ('and', 966), ('to', 762), ('of', 669), ('i', 631),
     ('you', 554), ('a', 546), ('my', 514), ('hamlet', 471), ('in', 451)]

Elements are counted from an iterable or initialized from another mapping (or counter):

    >>> c = Counter()                           # a new, empty counter
    >>> c = Counter('gallahad')                 # a new counter from an iterable
    >>> c = Counter({'red': 4, 'blue': 2})      # a new counter from a mapping
    >>> c = Counter(cats=4, dogs=8)             # a new counter from keyword args

Counter objects have a dictionary interface except that they return a zero count for missing items instead of raising a KeyError:

    >>> c = Counter(['eggs', 'ham'])
    >>> c['bacon']                              # count of a missing element is zero
    0

Setting a count to zero does not remove an element from a counter. Use del to remove it entirely:

    >>> c['sausage'] = 0                        # counter entry with a zero count
    >>> del c['sausage']                        # del actually removes the entry

Counter objects support three methods beyond those available for all dictionaries:

The usual dictionary methods are available for Counter objects except for two which work differently for counters.

    >>> c = Counter(a=4, b=2, c=0, d=-2)
    >>> list(c.elements())
    ['a', 'a', 'a', 'a', 'b', 'b']

    >>> Counter('abracadabra').most_common(3)
    [('a', 5), ('b', 2), ('r', 2)]

    >>> c = Counter(a=4, b=2, c=0, d=-2)
    >>> d = Counter(a=1, b=2, c=3, d=4)
    >>> c.subtract(d)
    >>> c
    Counter({'a': 3, 'b': 0, 'c': -3, 'd': -6})

Common patterns for working with Counter objects:

    sum(c.values())                 # total of all counts
    c.clear()                       # reset all counts
    list(c)                         # list unique elements
    set(c)                          # convert to a set
    dict(c)                         # convert to a regular dictionary
    c.items()                       # convert to a list of (elem, cnt) pairs
    Counter(dict(list_of_pairs))    # convert from a list of (elem, cnt) pairs
    c.most_common()[:-n-1:-1]       # n least common elements
    +c                              # remove zero and negative counts

Several mathematical operations are provided for combining Counter objects to produce multisets (counters that have counts greater than zero). Addition and subtraction combine counters by adding or subtracting the counts of corresponding elements. Intersection and union return the minimum and maximum of corresponding counts. Each operation can accept inputs with signed counts, but the output will exclude results with counts of zero or less.

    >>> c = Counter(a=3, b=1)
    >>> d = Counter(a=1, b=2)
    >>> c + d                       # add two counters together:  c[x] + d[x]
    Counter({'a': 4, 'b': 3})
    >>> c - d                       # subtract (keeping only positive counts)
    Counter({'a': 2})
    >>> c & d                       # intersection:  min(c[x], d[x])
    Counter({'a': 1, 'b': 1})
    >>> c | d                       # union:  max(c[x], d[x])
    Counter({'a': 3, 'b': 2})

Unary addition and subtraction are shortcuts for adding an empty counter or subtracting from an empty counter.

    >>> c = Counter(a=2, b=-4)
    >>> +c
    Counter({'a': 2})
    >>> -c
    Counter({'b': 4})

Note: Counters were primarily designed to work with positive integers to represent running counts; however, care was taken to not unnecessarily preclude use cases needing other types or negative values. To help with those use cases, this section documents the minimum range and type restrictions.

  • The Counter class itself is a dictionary subclass with no restrictions on its keys and values. The values are intended to be numbers representing counts, but you could store anything in the value field.
  • The most_common() method requires only that the values be orderable.
  • For in-place operations such as c[key] += 1, the value type need only support addition and subtraction. So fractions, floats, and decimals would work and negative values are supported. The same is also true for update() and subtract() which allow negative and zero values for both inputs and outputs.
  • The multiset methods are designed only for use cases with positive values. The inputs may be negative or zero, but only outputs with positive values are created. There are no type restrictions, but the value type needs to support addition, subtraction, and comparison.
  • The elements() method requires integer counts. It ignores zero and negative counts.
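
The restrictions listed above can be illustrated with a small sketch. The key names and values are arbitrary sample data, not part of the module.

```python
from collections import Counter
from fractions import Fraction

# In-place arithmetic only needs addition and subtraction,
# so fractions and floats work as values.
weights = Counter()
weights["a"] += Fraction(1, 3)
weights["a"] += Fraction(1, 6)
weights["b"] -= 2.5

# subtract() (like update()) keeps zero and negative results.
stock = Counter(apples=2, pears=1)
stock.subtract(apples=2, pears=3)

# The multiset operators only produce entries with positive values.
positive_only = stock + Counter(apples=1)

# elements() skips zero and negative counts.
visible = sorted(Counter(a=2, b=0, c=-1).elements())
```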

collections.deque objects

Deque objects support the following methods:

Deque objects also provide one read-only attribute:

In addition to the above, deques support iteration, pickling, len(d), reversed(d), copy.copy(d), copy.deepcopy(d), membership testing with the in operator, and subscript references such as d[-1]. Indexed access is O(1) at both ends but slows to O(n) in the middle. For fast random access, use lists instead.

    >>> from collections import deque
    >>> d = deque('ghi')                 # make a new deque with three items
    >>> for elem in d:                   # iterate over the deque's elements
    ...     print(elem.upper())
    G
    H
    I

    >>> d.append('j')                    # add a new entry to the right side
    >>> d.appendleft('f')                # add a new entry to the left side
    >>> d                                # show the representation of the deque
    deque(['f', 'g', 'h', 'i', 'j'])

    >>> d.pop()                          # return and remove the rightmost item
    'j'
    >>> d.popleft()                      # return and remove the leftmost item
    'f'
    >>> list(d)                          # list the contents of the deque
    ['g', 'h', 'i']
    >>> d[0]                             # peek at leftmost item
    'g'
    >>> d[-1]                            # peek at rightmost item
    'i'

    >>> list(reversed(d))                # list the contents of a deque in reverse
    ['i', 'h', 'g']
    >>> 'h' in d                         # search the deque
    True
    >>> d.extend('jkl')                  # add multiple elements at once
    >>> d
    deque(['g', 'h', 'i', 'j', 'k', 'l'])
    >>> d.rotate(1)                      # right rotation
    >>> d
    deque(['l', 'g', 'h', 'i', 'j', 'k'])
    >>> d.rotate(-1)                     # left rotation
    >>> d
    deque(['g', 'h', 'i', 'j', 'k', 'l'])

    >>> deque(reversed(d))               # make a new deque in reverse order
    deque(['l', 'k', 'j', 'i', 'h', 'g'])
    >>> d.clear()                        # empty the deque
    >>> d.pop()                          # cannot pop from an empty deque
    Traceback (most recent call last):
      File "<pyshell#6>", line 1, in -toplevel-
        d.pop()
    IndexError: pop from an empty deque

    >>> d.extendleft('abc')              # extendleft() reverses the input order
    >>> d
    deque(['c', 'b', 'a'])

collections.deque Recipes

This section shows various approaches to working with deques.

Bounded length deques provide functionality similar to the tail filter in Unix:

    def tail(filename, n=10):
        'Return the last n lines of a file'
        with open(filename) as f:
            return deque(f, n)

Another approach to using deques is to maintain a sequence of recently added elements by appending to the right and popping to the left:

    import itertools

    def moving_average(iterable, n=3):
        # moving_average([40, 30, 50, 46, 39, 44]) --> 40.0 42.0 45.0 43.0
        # http://en.wikipedia.org/wiki/Moving_average
        it = iter(iterable)
        d = deque(itertools.islice(it, n-1))
        d.appendleft(0)
        s = sum(d)
        for elem in it:
            s += elem - d.popleft()
            d.append(elem)
            yield s / n

The rotate() method provides a way to implement deque slicing and deletion. For example, a pure Python implementation of del d[n] relies on the rotate() method to position elements to be popped:

    def delete_nth(d, n):
        d.rotate(-n)
        d.popleft()
        d.rotate(n)

To implement deque slicing, use a similar approach applying rotate() to bring a target element to the left side of the deque. Remove old entries with popleft(), add new entries with extend(), and then reverse the rotation. With minor variations on that approach, it is easy to implement Forth style stack manipulations such as dup, drop, swap, over, pick, rot, and roll.
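
One possible reading of the slicing approach is sketched below. The helper name pop_slice is hypothetical, and the sketch assumes non-negative indexes with start <= stop <= len(d).

```python
from collections import deque

def pop_slice(d, start, stop):
    """Remove and return the items d[start:stop] as a list.

    Illustrative helper, not part of the collections module.
    """
    d.rotate(-start)                      # bring d[start] to the left end
    removed = [d.popleft() for _ in range(stop - start)]
    d.rotate(start)                       # restore the original alignment
    return removed

letters = deque("abcdefg")
removed = pop_slice(letters, 2, 5)
```

After the call, removed holds the three middle items and letters keeps the rest in their original order.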

collections.defaultdict Objects

defaultdict objects support the following method in addition to the standard dict operations:

defaultdict objects support the following instance variable:

collections.defaultdict Examples

Using list as the default_factory, it is easy to group a sequence of key-value pairs into a dictionary of lists:

    >>> s = [('yellow', 1), ('blue', 2), ('yellow', 3), ('blue', 4), ('red', 1)]
    >>> d = defaultdict(list)
    >>> for k, v in s:
    ...     d[k].append(v)
    ...
    >>> sorted(d.items())
    [('blue', [2, 4]), ('red', [1]), ('yellow', [1, 3])]

When each key is encountered for the first time, it is not already in the mapping; so an entry is automatically created using the default_factory function which returns an empty list. The list.append() operation then attaches the value to the new list. When keys are encountered again, the look-up proceeds normally (returning the list for that key) and the list.append() operation adds another value to the list. This technique is simpler and faster than an equivalent technique using dict.setdefault():

    >>> d = {}
    >>> for k, v in s:
    ...     d.setdefault(k, []).append(v)
    ...
    >>> sorted(d.items())
    [('blue', [2, 4]), ('red', [1]), ('yellow', [1, 3])]

Setting the default_factory to int makes the defaultdict useful for counting (like a bag or multiset in other languages):

    >>> s = 'mississippi'
    >>> d = defaultdict(int)
    >>> for k in s:
    ...     d[k] += 1
    ...
    >>> sorted(d.items())
    [('i', 4), ('m', 1), ('p', 2), ('s', 4)]

When a letter is first encountered, it is missing from the mapping, so the default_factory function calls int() to supply a default count of zero. The increment operation then builds up the count for each letter.

The function int() which always returns zero is a special case of constant functions. A faster and more flexible way to create constant functions is to use a lambda function that can supply any constant value (not only zero):

    >>> def constant_factory(value):
    ...     return lambda: value
    >>> d = defaultdict(constant_factory(''))
    >>> d.update(name='John', action='ran')
    >>> '%(name)s %(action)s to %(object)s' % d
    'John ran to '

Setting the default_factory to set makes the defaultdict useful for building a dictionary of sets:

    >>> s = [('red', 1), ('blue', 2), ('red', 3), ('blue', 4), ('red', 1), ('blue', 4)]
    >>> d = defaultdict(set)
    >>> for k, v in s:
    ...     d[k].add(v)
    ...
    >>> sorted(d.items())
    [('blue', {2, 4}), ('red', {1, 3})]

collections.namedtuple() Factory Function for Tuples with Named Fields

Named tuples assign meaning to each position in a tuple and allow for more readable, self-documenting code. They can be used wherever regular tuples are used, and they add the ability to access fields by name instead of position index.

    >>> # Basic example
    >>> Point = namedtuple('Point', ['x', 'y'])
    >>> p = Point(11, y=22)     # instantiate with positional or keyword arguments
    >>> p[0] + p[1]             # indexable like the plain tuple (11, 22)
    33
    >>> x, y = p                # unpack like a regular tuple
    >>> x, y
    (11, 22)
    >>> p.x + p.y               # fields also accessible by name
    33
    >>> p                       # readable repr with a name=value style
    Point(x=11, y=22)

Named tuples are especially useful for assigning field names to result tuples returned by the csv or sqlite3 modules:

    EmployeeRecord = namedtuple('EmployeeRecord', 'name, age, title, department, paygrade')

    import csv
    # In Python 3, csv.reader() needs a text-mode file opened with newline=''
    for emp in map(EmployeeRecord._make, csv.reader(open("employees.csv", newline=''))):
        print(emp.name, emp.title)

    import sqlite3
    conn = sqlite3.connect('/companydata')
    cursor = conn.cursor()
    cursor.execute('SELECT name, age, title, department, paygrade FROM employees')
    for emp in map(EmployeeRecord._make, cursor.fetchall()):
        print(emp.name, emp.title)

In addition to the methods inherited from tuples, named tuples support three additional methods and two attributes. To prevent conflicts with field names, the method and attribute names start with an underscore.

    >>> t = [11, 22]
    >>> Point._make(t)
    Point(x=11, y=22)

    >>> p = Point(x=11, y=22)
    >>> p._asdict()             # an OrderedDict before Python 3.8
    {'x': 11, 'y': 22}

    >>> p._replace(x=33)
    Point(x=33, y=22)

    >>> for partnum, record in inventory.items():
    ...     inventory[partnum] = record._replace(price=newprices[partnum], timestamp=time.now())

    >>> p._fields               # view the field names
    ('x', 'y')

    >>> Color = namedtuple('Color', 'red green blue')
    >>> Pixel = namedtuple('Pixel', Point._fields + Color._fields)
    >>> Pixel(11, 22, 128, 255, 0)
    Pixel(x=11, y=22, red=128, green=255, blue=0)

To retrieve a field whose name is stored in a string, use the getattr() function:

    >>> getattr(p, 'x')
    11

To convert a dictionary to a named tuple, use the double-star-operator (as described in Unpacking Argument Lists):

    >>> d = {'x': 11, 'y': 22}
    >>> Point(**d)
    Point(x=11, y=22)

Since a named tuple is a regular Python class, it is easy to add or change functionality with a subclass. Here is how to add a calculated field and a fixed-width print format:

    >>> class Point(namedtuple('Point', 'x y')):
    ...     __slots__ = ()
    ...     @property
    ...     def hypot(self):
    ...         return (self.x ** 2 + self.y ** 2) ** 0.5
    ...     def __str__(self):
    ...         return 'Point: x=%6.3f y=%6.3f hypot=%6.3f' % (self.x, self.y, self.hypot)

    >>> for p in Point(3, 4), Point(14, 5/7):
    ...     print(p)
    Point: x= 3.000 y= 4.000 hypot= 5.000
    Point: x=14.000 y= 0.714 hypot=14.018

The subclass shown above sets __slots__ to an empty tuple. This helps keep memory requirements low by preventing the creation of instance dictionaries.

Subclassing is not useful for adding new, stored fields. Instead, create a new named tuple type from the _fields attribute:

    >>> Point3D = namedtuple('Point3D', Point._fields + ('z',))

Default values can be implemented using _replace() to customize a prototype instance:

    >>> Account = namedtuple('Account', 'owner balance transaction_count')
    >>> default_account = Account('', 0.0, 0)
    >>> johns_account = default_account._replace(owner='John')
    >>> janes_account = default_account._replace(owner='Jane')

Enumerated constants can be implemented with named tuples, but it is simpler and more efficient to use a simple class declaration:

    >>> Status = namedtuple('Status', 'open pending closed')._make(range(3))
    >>> Status.open, Status.pending, Status.closed
    (0, 1, 2)
    >>> class Status:
    ...     open, pending, closed = range(3)

collections.OrderedDict objects

Ordered dictionaries are like regular dictionaries but they remember the order that items were inserted. When iterating over an ordered dictionary, the items are returned in the order their keys were first added.

methods:

In addition to the usual mapping methods, ordered dictionaries also support reverse iteration using reversed().

    >>> d = OrderedDict.fromkeys('abcde')
    >>> d.move_to_end('b')
    >>> ''.join(d.keys())
    'acdeb'
    >>> d.move_to_end('b', last=False)
    >>> ''.join(d.keys())
    'bacde'

Equality tests between OrderedDict objects are order-sensitive and are implemented as list(od1.items())==list(od2.items()). Equality tests between OrderedDict objects and other Mapping objects are order-insensitive like regular dictionaries. This allows OrderedDict objects to be substituted anywhere a regular dictionary is used.
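
A short sketch of these comparison rules, together with reverse iteration, using made-up keys and values:

```python
from collections import OrderedDict

first = OrderedDict([("a", 1), ("b", 2)])
second = OrderedDict([("b", 2), ("a", 1)])

# Two OrderedDicts compare equal only if their items appear in the same order.
ordered_equal = first == second

# Against a regular dict (or any other Mapping), order is ignored.
plain_equal = first == {"b": 2, "a": 1}

# reversed() walks the keys from last inserted to first inserted.
reversed_keys = list(reversed(first))
```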

The OrderedDict constructor and update() method both accept keyword arguments. Before Python 3.6, their order was lost because Python’s function call semantics passed keyword arguments in a regular unordered dictionary; from Python 3.6 on, the order of keyword arguments is preserved.

collections.OrderedDict Examples and Recipes

Since an ordered dictionary remembers its insertion order, it can be used in conjunction with sorting to make a sorted dictionary:

    >>> # regular unsorted dictionary
    >>> d = {'banana': 3, 'apple': 4, 'pear': 1, 'orange': 2}

    >>> # dictionary sorted by key
    >>> OrderedDict(sorted(d.items(), key=lambda t: t[0]))
    OrderedDict([('apple', 4), ('banana', 3), ('orange', 2), ('pear', 1)])

    >>> # dictionary sorted by value
    >>> OrderedDict(sorted(d.items(), key=lambda t: t[1]))
    OrderedDict([('pear', 1), ('orange', 2), ('banana', 3), ('apple', 4)])

    >>> # dictionary sorted by length of the key string
    >>> OrderedDict(sorted(d.items(), key=lambda t: len(t[0])))
    OrderedDict([('pear', 1), ('apple', 4), ('orange', 2), ('banana', 3)])

The new sorted dictionaries maintain their sort order when entries are deleted. But when new keys are added, the keys are appended to the end and the sort is not maintained.
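
The effect of deletions and insertions on a sorted OrderedDict can be sketched as follows. The key "cherry" is an arbitrary addition for illustration.

```python
from collections import OrderedDict

d = {"banana": 3, "apple": 4, "pear": 1, "orange": 2}
by_key = OrderedDict(sorted(d.items(), key=lambda t: t[0]))

# Deleting an entry leaves the remaining keys in sorted order.
del by_key["banana"]
after_delete = list(by_key)

# A new key is appended at the end, so the sort is no longer maintained.
by_key["cherry"] = 5
after_insert = list(by_key)
```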

It is also straightforward to create an ordered dictionary variant that remembers the order the keys were last inserted. If a new entry overwrites an existing entry, the original insertion position is changed and moved to the end:

    class LastUpdatedOrderedDict(OrderedDict):
        'Store items in the order the keys were last added'

        def __setitem__(self, key, value):
            if key in self:
                del self[key]
            OrderedDict.__setitem__(self, key, value)

An ordered dictionary can be combined with the Counter class so that the counter remembers the order elements are first encountered:

    class OrderedCounter(Counter, OrderedDict):
        'Counter that remembers the order elements are first encountered'

        def __repr__(self):
            return '%s(%r)' % (self.__class__.__name__, OrderedDict(self))

        def __reduce__(self):
            return self.__class__, (OrderedDict(self),)

collections.UserDict objects

The class, UserDict acts as a wrapper around dictionary objects. The need for this class is partially supplanted by the ability to subclass directly from dict; however, this class can be easier to work with because the underlying dictionary is accessible as an attribute.
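
As a sketch of why the accessible underlying dictionary is convenient, the hypothetical LowerCaseDict class below normalizes its keys. The class name and sample headers are assumptions made for illustration.

```python
from collections import UserDict

class LowerCaseDict(UserDict):
    """Hypothetical mapping that stores string keys in lower case."""

    def __setitem__(self, key, value):
        # UserDict routes its constructor and update() through __setitem__,
        # so this one override covers every way of adding items.
        self.data[key.lower()] = value

    def __getitem__(self, key):
        return self.data[key.lower()]

headers = LowerCaseDict({"Content-Type": "text/html"})
headers.update({"ACCEPT": "*/*"})
content_type = headers["CONTENT-TYPE"]
```

The wrapped dictionary remains a plain dict and can be inspected directly as headers.data.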

In addition to supporting the methods and operations of mappings, UserDict instances provide the following attribute:

collections.UserList objects

This class acts as a wrapper around list objects. It is a useful base class for your list-like classes that can inherit from them and override existing methods or add new ones. In this way, one can add new behaviors to lists.

The need for this class is partially supplanted by the ability to subclass directly from list; however, this class can be easier to work with because the underlying list is accessible as an attribute.

In addition to supporting the methods and operations of mutable sequences, UserList instances provide the following attribute:

Subclassing requirements: Subclasses of UserList are expected to offer a constructor that can be called with either no arguments or one argument. List operations which return a new sequence attempt to create an instance of the actual implementation class. To do so, it assumes that the constructor can be called with a single parameter, which is a sequence object used as a data source.

If a derived class does not want to comply with this requirement, all of the special methods supported by this class need to be overridden; please consult the sources for information about the methods which need to be provided in that case.
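
A minimal sketch of a conforming subclass follows. The Stack class and its peek() method are hypothetical; the subclass keeps the default constructor, so operations that build a new sequence can call it with a single sequence argument.

```python
from collections import UserList

class Stack(UserList):
    """Hypothetical list-like class that adds a peek() method."""

    def peek(self):
        # The wrapped list is available as the data attribute.
        return self.data[-1]

s = Stack([1, 2, 3])

# Concatenation and repetition return new sequences; UserList builds them
# by calling the subclass constructor with the resulting list.
combined = s + [4]
doubled = s * 2
```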

collections.UserString objects

The class, UserString acts as a wrapper around string objects. The need for this class is partially supplanted by the ability to subclass directly from str; however, this class can be easier to work with because the underlying string is accessible as an attribute.

Bisect: array bisection algorithm

This module provides support for maintaining a list in sorted order without having to sort the list after each insertion. For long lists of items with expensive comparison operations, this is an improvement over the more common approach. The module is called bisect because it uses a basic bisection algorithm to do its work. The source code may be most useful as a working example of the algorithm (the boundary conditions are already right!).
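
A brief sketch of keeping a list sorted with insort(), using arbitrary sample numbers:

```python
import bisect

scores = [10, 20, 40]

# insort() inserts each value at the position that keeps the list sorted,
# so no separate sort step is needed afterwards.
bisect.insort(scores, 30)
bisect.insort(scores, 20)

# bisect_left() and bisect_right() report where a value would be inserted,
# to the left or to the right of any equal entries already present.
left = bisect.bisect_left(scores, 20)
right = bisect.bisect_right(scores, 20)
```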

The following functions are provided:

Using bisect To Search Sorted Lists

The above bisect() functions are useful for finding insertion points but can be tricky or awkward to use for common searching tasks. The following five functions show how to transform them into the standard look ups for sorted lists:

    from bisect import bisect_left, bisect_right

    def index(a, x):
        'Locate the leftmost value exactly equal to x'
        i = bisect_left(a, x)
        if i != len(a) and a[i] == x:
            return i
        raise ValueError

    def find_lt(a, x):
        'Find rightmost value less than x'
        i = bisect_left(a, x)
        if i:
            return a[i-1]
        raise ValueError

    def find_le(a, x):
        'Find rightmost value less than or equal to x'
        i = bisect_right(a, x)
        if i:
            return a[i-1]
        raise ValueError

    def find_gt(a, x):
        'Find leftmost value greater than x'
        i = bisect_right(a, x)
        if i != len(a):
            return a[i]
        raise ValueError

    def find_ge(a, x):
        'Find leftmost item greater than or equal to x'
        i = bisect_left(a, x)
        if i != len(a):
            return a[i]
        raise ValueError

Examples

The bisect() function can be useful for numeric table look ups. This example uses bisect() to look up a letter grade for an exam score (say) based on a set of ordered numeric breakpoints: 90 and up is an ‘A’, 80 to 89 is a ‘B’, and so on:

    >>> def grade(score, breakpoints=[60, 70, 80, 90], grades='FDCBA'):
    ...     i = bisect(breakpoints, score)
    ...     return grades[i]
    ...
    >>> [grade(score) for score in [33, 99, 77, 70, 89, 90, 100]]
    ['F', 'A', 'C', 'C', 'B', 'A', 'A']

Unlike the sorted() function, the bisect() functions have no reversed argument and, before Python 3.10, had no key argument either. Computing keys inside the search leads to an inefficient design, because successive calls to the bisect functions do not “remember” the previous key lookups.

Instead, it is better to search a list of precomputed keys to find the index of the record in question:

    >>> data = [('red', 5), ('blue', 1), ('yellow', 8), ('black', 0)]
    >>> data.sort(key=lambda r: r[1])
    >>> keys = [r[1] for r in data]         # precomputed list of keys
    >>> data[bisect_left(keys, 0)]
    ('black', 0)
    >>> data[bisect_left(keys, 1)]
    ('blue', 1)
    >>> data[bisect_left(keys, 5)]
    ('red', 5)
    >>> data[bisect_left(keys, 8)]
    ('yellow', 8)

Array: efficient arrays of numeric values

This module defines an object type that can compactly represent an array of basic values: characters, integers, floating point numbers. Arrays are sequence types and behave very much like lists, except that the type of objects stored in them is constrained. The type is specified at object creation time using a type code, which is a single character. The following type codes are defined:

  • The ‘u’ type code corresponds to Python’s obsolete unicode character (Py_UNICODE which is wchar_t). Depending on the platform, it can be 16 bits or 32 bits. ‘u’ will be removed together with the rest of the Py_UNICODE API.
  • The ‘q’ and ‘Q’ type codes are available only if the platform C compiler used to build Python supports C long long, or, on Windows, __int64.

The actual representation of values is determined by the machine architecture (strictly speaking, by the C implementation). The actual size can be accessed through the itemsize attribute.

The module defines the following types:

Array objects support the ordinary sequence operations of indexing, slicing, concatenation, and multiplication. When using slice assignment, the assigned value must be an array object with the same type code; in all other cases, TypeError is raised. Array objects also implement the buffer interface, and may be used wherever bytes-like objects are supported.
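
The sequence operations and the slice-assignment rule can be sketched as follows. The values are arbitrary, and the byte length is compared against itemsize rather than a fixed number because item sizes are platform dependent.

```python
from array import array

numbers = array("l", [1, 2, 3, 4, 5])

# Ordinary sequence operations work much as they do for lists.
tail = numbers[3:]                       # slicing returns another array
joined = numbers + array("l", [6])       # concatenating arrays of one type code

# Slice assignment accepts only an array with the same type code.
numbers[0:2] = array("l", [10, 20])
try:
    numbers[0:2] = array("d", [1.0, 2.0])
    mismatch_rejected = False
except TypeError:
    mismatch_rejected = True

# The buffer interface lets bytes() read the raw storage directly.
raw = bytes(numbers)
```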

The following data items and methods are also supported:

When an array object is printed or converted to a string, it is represented as array(typecode, initializer). The initializer is omitted if the array is empty, otherwise it is a string if the typecode is ‘u’, otherwise it is a list of numbers. The string is guaranteed to be able to be converted back to an array with the same type and value using eval(), so long as the array() function is imported using from array import array. Examples:

    array('l')
    array('u', 'hello \u2641')
    array('l', [1, 2, 3, 4, 5])
    array('d', [1.0, 2.0, 3.14])

Enum: support for enumerations

An enumeration is a set of symbolic names (members) bound to unique, constant values. Within an enumeration, the members can be compared by identity, and the enumeration itself can be iterated over.

enum Module Contents

This module defines two enumeration classes that can define unique sets of names and values: Enum and IntEnum. It also defines one decorator, unique().

Creating an enum.Enum

Enumerations are created using the class syntax, which makes them easy to read and write. To define an enumeration, subclass Enum as follows:

    >>> from enum import Enum
    >>> class Color(Enum):
    ...     red = 1
    ...     green = 2
    ...     blue = 3
    ...

  • The class Color is an enumeration (or enum)
  • The attributes Color.red, Color.green, etc., are enumeration members (or enum members).
  • The enum members have names and values (the name of Color.red is red, the value of Color.blue is 3, etc.)

Note:

Even though we use the class syntax to create Enums, Enums are not normal Python classes.

Enumeration members have human readable string representations:

    >>> print(Color.red)
    Color.red

…while their repr has more information:

    >>> print(repr(Color.red))
    <Color.red: 1>

The type of an enumeration member is the enumeration it belongs to:

    >>> type(Color.red)
    <enum 'Color'>
    >>> isinstance(Color.green, Color)
    True

Enum members also have a property containing their item name:

    >>> print(Color.red.name)
    red

Enumerations support iteration, in definition order:

    >>> class Shake(Enum):
    ...     vanilla = 7
    ...     chocolate = 4
    ...     cookies = 9
    ...     mint = 3
    ...
    >>> for shake in Shake:
    ...     print(shake)
    ...
    Shake.vanilla
    Shake.chocolate
    Shake.cookies
    Shake.mint

Enumeration members are hashable, so they can be used in dictionaries and sets:

    >>> apples = {}
    >>> apples[Color.red] = 'red delicious'
    >>> apples[Color.green] = 'granny smith'
    >>> apples == {Color.red: 'red delicious', Color.green: 'granny smith'}
    True

Programmatic Access To Enumeration Members And Their Attributes

Sometimes it’s useful to access members in enumerations programmatically (i.e., situations where Color.red won’t do because the exact color is not known at program-writing time). Enum allows such access:

    >>> Color(1)
    <Color.red: 1>
    >>> Color(3)
    <Color.blue: 3>

If you want to access enum members by name, use item access:

    >>> Color['red']
    <Color.red: 1>
    >>> Color['green']
    <Color.green: 2>

If you have an enum member and need its name or value:

    >>> member = Color.red
    >>> member.name
    'red'
    >>> member.value
    1

Duplicating enum Members And Values

Having two enum members with the same name is invalid:

    >>> class Shape(Enum):
    ...     square = 2
    ...     square = 3
    ...
    Traceback (most recent call last):
    ...
    TypeError: Attempted to reuse key: 'square'

However, two enum members are allowed to have the same value. Given two members A and B with the same value (and A defined first), B is an alias to A. By-value lookup of the value of A and B returns A. By-name lookup of B also returns A:

    >>> class Shape(Enum):
    ...     square = 2
    ...     diamond = 1
    ...     circle = 3
    ...     alias_for_square = 2
    ...
    >>> Shape.square
    <Shape.square: 2>
    >>> Shape.alias_for_square
    <Shape.square: 2>
    >>> Shape(2)
    <Shape.square: 2>

Note: Attempting to create a member with the same name as an already defined attribute (another member, a method, etc.) or attempting to create an attribute with the same name as a member is not allowed.

Ensuring Unique Enumeration Values

By default, enumerations allow multiple names as aliases for the same value. When this behavior isn’t desired, the following decorator can ensure each value is used only once in the enumeration:

@enum.unique

A class decorator specifically for enumerations. It searches an enumeration’s members gathering any aliases it finds; if any are found ValueError is raised with the details:

    >>> from enum import Enum, unique
    >>> @unique
    ... class Mistake(Enum):
    ...     one = 1
    ...     two = 2
    ...     three = 3
    ...     four = 3
    ...
    Traceback (most recent call last):
    ...
    ValueError: duplicate values found in <enum 'Mistake'>: four -> three

Iteration

Iterating over the members of an enum does not provide the aliases:

    >>> list(Shape)
    [<Shape.square: 2>, <Shape.diamond: 1>, <Shape.circle: 3>]

The special attribute __members__ is an ordered dictionary mapping names to members. It includes all names defined in the enumeration, including the aliases:

    >>> for name, member in Shape.__members__.items():
    ...     name, member
    ...
    ('square', <Shape.square: 2>)
    ('diamond', <Shape.diamond: 1>)
    ('circle', <Shape.circle: 3>)
    ('alias_for_square', <Shape.square: 2>)

The __members__ attribute can be used for detailed programmatic access to the enumeration members. For example, finding all the aliases:

    >>> [name for name, member in Shape.__members__.items() if member.name != name]
    ['alias_for_square']

Comparisons

Enumeration members are compared by identity:

Color.red is Color.red True Color.red is Color.blue False Color.red is not Color.blue True

Ordered comparisons between enumeration values are not supported. Enum members are not integers (but see IntEnum below):

Color.red < Color.blue Traceback (most recent call last): File “”, line 1, in TypeError: unorderable types: Color() < Color()

Equality comparisons are defined though:

    >>> Color.blue == Color.red
    False
    >>> Color.blue != Color.red
    True
    >>> Color.blue == Color.blue
    True

Comparisons against non-enumeration values always compare not equal (again, IntEnum was explicitly designed to behave differently, see below):

    >>> Color.blue == 2
    False

Allowed Members And Attributes Of Enumerations

The examples above use integers for enumeration values. Using integers is short and handy (and provided by default by the Functional API), but not strictly enforced. In the vast majority of use-cases, one doesn’t care what the actual value of an enumeration is. But if the value is important, enumerations can have arbitrary values.

Enumerations are Python classes, and can have methods and special methods as usual. If we have this enumeration:

    >>> class Mood(Enum):
    ...     funky = 1
    ...     happy = 3
    ...
    ...     def describe(self):
    ...         # self is the member here
    ...         return self.name, self.value
    ...
    ...     def __str__(self):
    ...         return 'my custom str! {0}'.format(self.value)
    ...
    ...     @classmethod
    ...     def favorite_mood(cls):
    ...         # cls here is the enumeration
    ...         return cls.happy
    ...

Then:

    >>> Mood.favorite_mood()
    <Mood.happy: 3>
    >>> Mood.happy.describe()
    ('happy', 3)
    >>> str(Mood.funky)
    'my custom str! 1'

The rules for what is allowed are as follows: sunder names (names that start and end with a single underscore, such as _name_) are reserved by enum and cannot be used; all other attributes defined within an enumeration become members of it, except for dunder names (special methods such as __str__() and __add__()) and descriptors (methods are also descriptors).
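These rules can be checked directly. The sketch below assumes a hypothetical sunder name, _reserved_, chosen only because enum does not use it; methods and properties are descriptors and so do not show up as members.

```python
from enum import Enum


class Mood(Enum):
    funky = 1
    happy = 3

    def describe(self):
        # A method is a descriptor, so it does not become a member.
        return self.name, self.value

    @property
    def is_positive(self):
        # A property is also a descriptor, so it is not a member either.
        return self is Mood.happy


member_names = [m.name for m in Mood]
print(member_names)


def sunder_name_allowed():
    """Return True if an unrecognized sunder name can be defined."""
    try:
        class Bad(Enum):
            _reserved_ = 1  # hypothetical sunder name, not used by enum
    except ValueError:
        return False
    return True


print(sunder_name_allowed())
```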

Note: If your enumeration defines __new__() and/or __init__(), then whatever value(s) were given to the enum member will be passed into those methods.

Restricted Subclassing Of Enumerations

Subclassing an enumeration is allowed only if the enumeration does not define any members. So this is forbidden:

    >>> class MoreColor(Color):
    ...     pink = 17
    ...
    Traceback (most recent call last):
    ...
    TypeError: Cannot extend enumerations

But this is allowed:

    >>> class Foo(Enum):
    ...     def some_behavior(self):
    ...         pass
    ...
    >>> class Bar(Foo):
    ...     happy = 1
    ...     sad = 2
    ...

Allowing subclassing of enums that define members would lead to a violation of some important invariants of types and instances. On the other hand, it makes sense to allow sharing some common behavior between a group of enumerations. (See OrderedEnum for an example.)

Pickling

Enumerations can be pickled and unpickled:

    >>> from test.test_enum import Fruit
    >>> from pickle import dumps, loads
    >>> Fruit.tomato is loads(dumps(Fruit.tomato))
    True

The usual restrictions for pickling apply: picklable enums must be defined in the top level of a module since unpickling requires them to be importable from that module.

With pickle protocol version 4 it is possible to easily pickle enums nested in other classes.

It is possible to modify how Enum members are pickled/unpickled by defining __reduce_ex__() in the enumeration class.

enum Functional API

The Enum class is callable, providing the following functional API:

    >>> Animal = Enum('Animal', 'ant bee cat dog')
    >>> Animal
    <enum 'Animal'>
    >>> Animal.ant
    <Animal.ant: 1>
    >>> Animal.ant.value
    1
    >>> list(Animal)
    [<Animal.ant: 1>, <Animal.bee: 2>, <Animal.cat: 3>, <Animal.dog: 4>]

The semantics of this API resemble namedtuple. The first argument of the call to Enum is the name of the enumeration.

The second argument is the source of enumeration member names. It is a whitespace-separated string of names, a sequence of names, a sequence of 2-tuples with key/value pairs, or a mapping (e.g., dictionary) of names to values. The last two options enable assigning arbitrary values to enumerations; the others auto-assign increasing integers starting with 1. A new class derived from Enum is returned. In other words, the above assignment to Animal is equivalent to:

    >>> class Animal(Enum):
    ...     ant = 1
    ...     bee = 2
    ...     cat = 3
    ...     dog = 4
    ...

The reason for defaulting to 1 as the starting number and not 0 is that 0 is False in a boolean sense, but enum members all evaluate to True.
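The four accepted forms of the second argument can be compared side by side. This is an illustrative sketch; the enumeration names FromString, FromList, FromPairs, and FromMapping are made up for the example.

```python
from enum import Enum

# Whitespace-separated string: values are auto-assigned starting at 1.
FromString = Enum('FromString', 'ant bee cat')

# Sequence of names: values are also auto-assigned starting at 1.
FromList = Enum('FromList', ['ant', 'bee', 'cat'])

# Sequence of (name, value) 2-tuples: values are given explicitly.
FromPairs = Enum('FromPairs', [('ant', 10), ('bee', 20), ('cat', 30)])

# Mapping of names to values: values are given explicitly.
FromMapping = Enum('FromMapping', {'ant': 'a', 'bee': 'b', 'cat': 'c'})

for enum_cls in (FromString, FromList, FromPairs, FromMapping):
    print(enum_cls.__name__, [(m.name, m.value) for m in enum_cls])

# Members of a plain Enum are truthy, which is why numbering starts at 1.
all_truthy = all(bool(member) for member in FromString)
print(all_truthy)
```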

Pickling enums created with the functional API can be tricky as frame stack implementation details are used to try and figure out the module of the enumeration. The solution is to specify the module name explicitly as follows:

    >>> Animals = Enum('Animals', 'ant bee cat dog', module=__name__)

Note: If module is not supplied, and Enum cannot determine what it is, the new Enum members cannot be unpickled; to keep errors closer to the source, pickling is disabled.

The new pickle protocol 4 also, in some circumstances, relies on __qualname__ being set to the location where pickle can find the class. For example, if the class was made available in class SomeData in the global scope:

    >>> Animals = Enum('Animals', 'ant bee cat dog', qualname='SomeData.Animals')

The complete signature is:

    Enum(value='NewEnumName', names=<…>, *, module='…', qualname='…', type=<mixed-in class>)

Derived enumerations

IntEnum

A variation of Enum is provided that is also a subclass of int. Members of an IntEnum can be compared to integers; by extension, integer enumerations of different types can also be compared to each other:

    >>> from enum import IntEnum
    >>> class Shape(IntEnum):
    ...     circle = 1
    ...     square = 2
    ...
    >>> class Request(IntEnum):
    ...     post = 1
    ...     get = 2
    ...
    >>> Shape == 1
    False
    >>> Shape.circle == 1
    True
    >>> Shape.circle == Request.post
    True

However, they still can’t be compared to standard Enum enumerations:

    >>> class Shape(IntEnum):
    ...     circle = 1
    ...     square = 2
    ...
    >>> class Color(Enum):
    ...     red = 1
    ...     green = 2
    ...
    >>> Shape.circle == Color.red
    False

IntEnum values behave like integers in other ways you’d expect:

    >>> int(Shape.circle)
    1
    >>> ['a', 'b', 'c'][Shape.circle]
    'b'
    >>> [i for i in range(Shape.square)]
    [0, 1]

For the vast majority of code, Enum is strongly recommended since IntEnum breaks some semantic promises of an enumeration (by being comparable to integers, and thus by transitivity to other unrelated enumerations). It should be used only in special cases where there’s no other choice; for example, when integer constants are replaced with enumerations and backward compatibility is required with code that still expects integers.
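As a sketch of the backward-compatibility case, suppose older callers still pass bare integer codes. The constants LEGACY_OK and LEGACY_NOT_FOUND, the Status enumeration, and the describe function are all hypothetical names invented for this example.

```python
from enum import IntEnum

# Hypothetical legacy constants that older code still passes as plain ints.
LEGACY_OK = 0
LEGACY_NOT_FOUND = 1


class Status(IntEnum):
    ok = LEGACY_OK
    not_found = LEGACY_NOT_FOUND


def describe(status):
    """Accept either a Status member or a legacy integer code."""
    # Status(...) looks up the member by value; a member is returned as-is.
    return Status(status).name


print(describe(Status.ok))
print(describe(1))
print(Status.not_found == LEGACY_NOT_FOUND)
```

New code can use the named members while existing callers keep working, at the cost of members comparing equal to unrelated integers.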

Other Enums

While IntEnum is part of the enum module, it would be very simple to implement independently:

class IntEnum(int, Enum): pass

This demonstrates how similar derived enumerations can be defined; for example a StrEnum that mixes in str instead of int.
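A str-based variant might look like the following. This is only a sketch of the idea described above, defined locally for illustration; the Color members are likewise made up.

```python
from enum import Enum


class StrEnum(str, Enum):
    """Illustrative str-based enumeration, defined the same way as IntEnum."""


class Color(StrEnum):
    red = 'red'
    green = 'green'


is_str = isinstance(Color.red, str)    # members are real str instances
equals_plain = Color.red == 'red'      # and compare equal to plain strings
upper = Color.red.upper()              # str methods work on members
print(is_str, equals_plain, upper)
```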

Some rules:

  • When subclassing Enum, mix-in types must appear before Enum itself in the sequence of bases, as in the IntEnum example above.
  • While Enum can have members of any type, once you mix in an additional type, all the members must have values of that type, e.g., int above. This restriction does not apply to mix-ins which only add methods and don’t specify another data type such as int or str.
  • When another data type is mixed in, the value attribute is not the same as the enum member itself, although it is equivalent and will compare equal.
  • %-style formatting: %s and %r call Enum's __str__() and __repr__() respectively; other codes (such as %i or %h for IntEnum) treat the enum member as its mixed-in type.
  • str.format() (or format()) uses the mixed-in type's __format__(). If the Enum's __str__() or __repr__() is desired, use the !s or !r str format codes.
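Two of the rules above can be seen in a short sketch: the value of a mixed-in member is a plain instance of the mixed-in type, and numeric %-codes treat the member as that type. The Weight enumeration is a hypothetical example.

```python
from enum import Enum


class Weight(int, Enum):   # the mix-in type comes before Enum
    light = 1
    heavy = 10


member = Weight.heavy
value = member.value

print(type(member).__name__, type(value).__name__)
print(value == member)    # equivalent: they compare equal
print(value is member)    # but the value is not the member itself
print('%d' % member)      # numeric code uses the int behaviour
print(member + 5)         # arithmetic yields a plain int
```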

While Enum and IntEnum are expected to cover the majority of use-cases, they cannot cover them all. Here are recipes for some different types of enumerations that can be used directly, or as examples for creating one’s own.

enum Example: AutoNumber

Avoids having to specify the value for each enumeration member:

    >>> class AutoNumber(Enum):
    ...     def __new__(cls):
    ...         value = len(cls.__members__) + 1
    ...         obj = object.__new__(cls)
    ...         obj._value_ = value
    ...         return obj
    ...
    >>> class Color(AutoNumber):
    ...     red = ()
    ...     green = ()
    ...     blue = ()
    ...
    >>> Color.green.value == 2
    True

Note: The __new__() method, if defined, is used during creation of the Enum members; it is then replaced by Enum's __new__(), which is used after class creation for lookup of existing members. Due to the way Enums are supposed to behave, there is no way to customize Enum's __new__().

enum Example: OrderedEnum

An ordered enumeration that is not based on IntEnum and so maintains the normal Enum invariants (such as not being comparable to other enumerations):

    >>> class OrderedEnum(Enum):
    ...     def __ge__(self, other):
    ...         if self.__class__ is other.__class__:
    ...             return self.value >= other.value
    ...         return NotImplemented
    ...     def __gt__(self, other):
    ...         if self.__class__ is other.__class__:
    ...             return self.value > other.value
    ...         return NotImplemented
    ...     def __le__(self, other):
    ...         if self.__class__ is other.__class__:
    ...             return self.value <= other.value
    ...         return NotImplemented
    ...     def __lt__(self, other):
    ...         if self.__class__ is other.__class__:
    ...             return self.value < other.value
    ...         return NotImplemented
    ...
    >>> class Grade(OrderedEnum):
    ...     A = 5
    ...     B = 4
    ...     C = 3
    ...     D = 2
    ...     F = 1
    ...
    >>> Grade.C < Grade.A
    True
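With these comparison methods in place, members can be sorted, while comparisons across different enumerations still fail. The sketch below repeats the recipe so that it runs on its own; the Size enumeration is a hypothetical second type added to show the cross-enumeration case.

```python
from enum import Enum


class OrderedEnum(Enum):
    def __ge__(self, other):
        if self.__class__ is other.__class__:
            return self.value >= other.value
        return NotImplemented

    def __gt__(self, other):
        if self.__class__ is other.__class__:
            return self.value > other.value
        return NotImplemented

    def __le__(self, other):
        if self.__class__ is other.__class__:
            return self.value <= other.value
        return NotImplemented

    def __lt__(self, other):
        if self.__class__ is other.__class__:
            return self.value < other.value
        return NotImplemented


class Grade(OrderedEnum):
    A = 5
    B = 4
    C = 3
    D = 2
    F = 1


class Size(OrderedEnum):   # hypothetical unrelated ordered enumeration
    small = 1
    large = 2


ranked = sorted([Grade.C, Grade.A, Grade.F], reverse=True)
best = max(Grade)

try:
    Grade.A < Size.small
    cross_comparison_allowed = True
except TypeError:
    # Both sides return NotImplemented, so Python raises TypeError.
    cross_comparison_allowed = False

print(ranked, best, cross_comparison_allowed)
```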

enum Example: DuplicateFreeEnum

Raises an error if a duplicate member value is found instead of creating an alias:

    >>> class DuplicateFreeEnum(Enum):
    ...     def __init__(self, *args):
    ...         cls = self.__class__
    ...         if any(self.value == e.value for e in cls):
    ...             a = self.name
    ...             e = cls(self.value).name
    ...             raise ValueError(
    ...                 "aliases not allowed in DuplicateFreeEnum:  %r --> %r"
    ...                 % (a, e))
    ...
    >>> class Color(DuplicateFreeEnum):
    ...     red = 1
    ...     green = 2
    ...     blue = 3
    ...     grene = 2
    ...
    Traceback (most recent call last):
    ...
    ValueError: aliases not allowed in DuplicateFreeEnum:  'grene' --> 'green'

This is a useful example for subclassing Enum to add or change other behaviors and disallowing aliases. If the only desired change is disallowing aliases, the unique() decorator can be used instead.

enum Example: Planet

If __new__() or __init__() is defined, the value of the enum member will be passed to those methods:

    >>> class Planet(Enum):
    ...     MERCURY = (3.303e+23, 2.4397e6)
    ...     VENUS   = (4.869e+24, 6.0518e6)
    ...     EARTH   = (5.976e+24, 6.37814e6)
    ...     MARS    = (6.421e+23, 3.3972e6)
    ...     JUPITER = (1.9e+27,   7.1492e7)
    ...     SATURN  = (5.688e+26, 6.0268e7)
    ...     URANUS  = (8.686e+25, 2.5559e7)
    ...     NEPTUNE = (1.024e+26, 2.4746e7)
    ...     def __init__(self, mass, radius):
    ...         self.mass = mass       # in kilograms
    ...         self.radius = radius   # in meters
    ...     @property
    ...     def surface_gravity(self):
    ...         # universal gravitational constant  (m3 kg-1 s-2)
    ...         G = 6.67300E-11
    ...         return G * self.mass / (self.radius * self.radius)
    ...
    >>> Planet.EARTH.value
    (5.976e+24, 6378140.0)
    >>> Planet.EARTH.surface_gravity
    9.802652743337129