Misc utilities.

class oauth_dropins.webutil.util.CacheDict[source]

Bases: dict

A dict that also implements memcache’s get_multi() and set_multi() methods.

Useful as a simple in-memory replacement for App Engine’s memcache API, e.g. for get_activities_response() in snarfed/activitystreams-unofficial.
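As a rough illustration of the idea, here is a minimal sketch of a dict with memcache-style batch methods. The class name SimpleCacheDict is hypothetical, and the sketch assumes that get_multi() omits missing keys and that extra keyword arguments such as time are ignored; the real class may behave differently.

```python
class SimpleCacheDict(dict):
    """Hypothetical stand-in: a dict with memcache-style batch methods."""

    def get_multi(self, keys):
        # Assumption: like memcache, missing keys are left out of the result.
        return {k: self[k] for k in keys if k in self}

    def set(self, key, val, **kwargs):
        # Extra kwargs (e.g. time=...) are accepted and ignored.
        self[key] = val

    def set_multi(self, updates, **kwargs):
        self.update(updates)


cache = SimpleCacheDict()
cache.set('a', 1, time=60)
cache.set_multi({'b': 2, 'c': 3})
found = cache.get_multi(['a', 'c', 'missing'])
```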

set(key, val, **kwargs)[source]
set_multi(updates, **kwargs)[source]
class oauth_dropins.webutil.util.FileLimiter(file_obj, read_limit)[source]

Bases: object

A file object wrapper that reads up to a limit and then reports EOF.

From http://stackoverflow.com/a/29838711/186123 . Thanks SO!
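The wrapper pattern can be sketched roughly as follows. LimitedReader is a hypothetical name, and the sketch assumes a binary file object and a limit counted in bytes; it is not the library's implementation.

```python
import io


class LimitedReader:
    """Hypothetical sketch: stops returning data after read_limit bytes."""

    def __init__(self, file_obj, read_limit):
        self.file_obj = file_obj
        self.remaining = read_limit

    def read(self, amount=-1):
        if self.remaining <= 0:
            return b''  # report EOF once the limit is reached
        if amount < 0 or amount > self.remaining:
            amount = self.remaining
        data = self.file_obj.read(amount)
        self.remaining -= len(data)
        return data


reader = LimitedReader(io.BytesIO(b'0123456789'), 4)
first = reader.read()   # reads up to the limit
second = reader.read()  # limit exhausted, so this looks like EOF
```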

class oauth_dropins.webutil.util.SimpleTzinfo[source]

Bases: datetime.tzinfo

A simple, DST-unaware tzinfo subclass.


datetime -> DST offset in minutes east of UTC.

offset = datetime.timedelta(0)

datetime -> minutes east of UTC (negative for west of UTC).

class oauth_dropins.webutil.util.Struct(**kwargs)[source]

Bases: object

A generic class that initializes its attributes from constructor kwargs.

class oauth_dropins.webutil.util.UrlCanonicalizer(scheme='https', domain=None, subdomain=None, approve=None, reject=None, query=False, fragment=False, trailing_slash=False, redirects=True, headers=None)[source]

Bases: object

Converts URLs to their canonical form.

If an input URL matches approve or reject, it’s automatically approved or rejected, respectively, without following redirects.

If we HEAD the URL to follow redirects and it returns 4xx or 5xx, we return None.

oauth_dropins.webutil.util.add_query_params(url, params)[source]

Adds new query parameters to a URL. Encodes as UTF-8 and URL-safe.

  • url – string URL or urllib2.Request. May already have query parameters.
  • params – dict or list of (string key, string value) tuples. Keys may repeat.

Returns: string URL
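A minimal sketch of the same idea using only the Python 3 standard library. The name add_params is hypothetical, and this version only handles string URLs, not request objects.

```python
from urllib.parse import parse_qsl, urlencode, urlparse, urlunparse


def add_params(url, params):
    """Hypothetical sketch: appends params to url, keeping existing ones."""
    if isinstance(params, dict):
        params = list(params.items())
    parsed = urlparse(url)
    # keep_blank_values preserves existing params like '?x='
    query = parse_qsl(parsed.query, keep_blank_values=True) + list(params)
    # urlencode percent-encodes keys and values as UTF-8
    return urlunparse(parsed._replace(query=urlencode(query)))


result = add_params('http://example.com/path?a=1', [('b', 'x y'), ('a', '2')])
```

Passing a list of tuples rather than a dict is what allows a key such as a to repeat.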


Converts a timezone-aware datetime to a naive UTC datetime.

If input is timezone-naive, it’s returned as is.

Doesn’t support DST!


Returns the base of a given URL.

For example, returns ‘http://site/posts/’ for ‘http://site/posts/123’.

Parameters:url – string

Removes transient query params (e.g. utm_*) from a URL.

The utm_* (Urchin Tracking Module) params come from Google Analytics: https://support.google.com/analytics/answer/1033867

The source=rss-… params are on all links in Medium’s RSS feeds.

Parameters:url – string

Returns: string, the cleaned url, or None if it can’t be parsed
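The filtering described above might look roughly like this. The function name strip_tracking_params is hypothetical, and the sketch only covers the two parameter patterns mentioned here; the real function may strip others.

```python
from urllib.parse import parse_qsl, urlencode, urlparse, urlunparse


def strip_tracking_params(url):
    """Hypothetical sketch: drops utm_* and source=rss-... query params."""
    try:
        parsed = urlparse(url)
    except ValueError:
        return None  # mirrors the documented "None if it can't be parsed"
    kept = [(k, v) for k, v in parse_qsl(parsed.query, keep_blank_values=True)
            if not k.startswith('utm_')
            and not (k == 'source' and v.startswith('rss-'))]
    return urlunparse(parsed._replace(query=urlencode(kept)))


cleaned = strip_tracking_params(
    'https://example.com/post?id=7&utm_source=feed&source=rss-abc123')
```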


Normalizes and de-dupes http(s) URLs.

Converts domain to lower case, adds trailing slash when path is empty, and ignores scheme (http vs https), preferring https. Preserves order.

Domains are case insensitive, even modern domains with Unicode/punycode characters:

http://unicode.org/faq/idn.html#6 https://tools.ietf.org/html/rfc4343#section-5

As examples, http://foo/ and https://FOO are considered duplicates, but http://foo/bar and http://foo/bar/ aren’t.

Background: https://en.wikipedia.org/wiki/URL_normalization

TODO: port to https://pypi.python.org/pypi/urlnorm

Parameters:urls – sequence of string URLs
Returns:sequence of string URLs
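The normalization rules above can be sketched as follows. This is an illustration under stated assumptions, not the library's code: dedupe is a hypothetical name, and when an http URL is followed by its https twin, the sketch upgrades the earlier entry in place so that the original order is kept.

```python
from urllib.parse import urlparse, urlunparse


def dedupe(urls):
    """Hypothetical sketch: de-dupes http(s) URLs, preferring https."""
    result = []
    index_by_key = {}  # scheme-less normalized URL -> position in result
    for url in urls:
        parsed = urlparse(url)
        # Lower-case the domain and add a trailing slash to an empty path.
        parsed = parsed._replace(netloc=parsed.netloc.lower(),
                                 path=parsed.path or '/')
        key = parsed._replace(scheme='')
        if key not in index_by_key:
            index_by_key[key] = len(result)
            result.append(parsed)
        elif parsed.scheme == 'https':
            # Upgrade an earlier http duplicate in place to keep its order.
            result[index_by_key[key]] = parsed
    return [urlunparse(p) for p in result]


deduped = dedupe(['http://foo/', 'https://FOO', 'http://foo/bar',
                  'http://foo/bar/'])
```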

Extracts and returns the meaningful domain from a URL.

Strips www., mobile., and m. from the beginning of the domain.

Parameters:url – string

Returns: string

oauth_dropins.webutil.util.domain_or_parent_in(input, domains)[source]

Returns True if an input domain or its parent is in a set of domains.


foo, [] => False
foo, [foo] => True
foo.bar.com, [bar.com] => True
foo.bar.com, [.bar.com] => True
foo.bar.com, [fux.bar.com] => False
bar.com, [fux.bar.com] => False

  • input – string domain
  • domains – sequence of string domains

Returns: boolean
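One way to get the behavior in the examples above is to check the domain and then each of its parents against the set. The name in_domains is hypothetical, and treating a leading dot as optional is an assumption drawn from the .bar.com example.

```python
def in_domains(domain, domains):
    """Hypothetical sketch matching the examples listed above."""
    # Treat '.bar.com' the same as 'bar.com'.
    allowed = {d.lstrip('.') for d in domains}
    parts = domain.split('.')
    # Check the domain itself, then each parent: foo.bar.com, bar.com, com.
    return any('.'.join(parts[i:]) in allowed for i in range(len(parts)))
```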

oauth_dropins.webutil.util.ellipsize(str, words=14, chars=140)[source]

Truncates and ellipsizes str if it’s longer than words or chars.

Words are simply tokenized on whitespace, nothing smart.
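A rough sketch of this kind of truncation is below. The name ellipsize_text is hypothetical, and the exact cut-off rules (for example, reserving three characters for the ellipsis) are assumptions rather than the library's documented behavior.

```python
def ellipsize_text(text, words=14, chars=140):
    """Hypothetical sketch: truncate to a word and character budget."""
    tokens = text.split()  # plain whitespace tokenization
    if len(tokens) <= words and len(text) <= chars:
        return text
    truncated = ' '.join(tokens[:words])
    if len(truncated) > chars:
        # Leave room for the three-character ellipsis.
        truncated = truncated[:chars - 3]
    return truncated + '...'
```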

Returns a list of unique string URLs in the given text.

URLs in the returned list are in the order they first appear in the text.

oauth_dropins.webutil.util.follow_redirects(url, cache=None, fail_cache_time_secs=86400, **kwargs)[source]

Fetches a URL with HEAD, repeating if necessary to follow redirects.

Does not raise an exception if any of the HTTP requests fail, just returns the failed response. If you care, be sure to check the returned response’s status code!

  • url – string
  • cache – optional, a cache object to read and write resolved URLs to. Must have get(key) and set(key, value, time=…) methods. Stores ‘R [original URL]’ in key, final URL in value.
  • kwargs – passed to requests.head()

Returns: the requests.Response for the final request. The url attribute has the final URL.


Strips the fragment (e.g. ‘#foo’) from a URL.

Parameters:url – string

Returns: string URL


Generates a URL-safe random secret string.

Uses App Engine’s os.urandom(), which is designed to be cryptographically secure: http://code.google.com/p/googleappengine/issues/detail?id=1055

Parameters:bytes – integer, length of string to generate

Returns: random string

oauth_dropins.webutil.util.get_first(dict, key, default=None)[source]

Returns the first element of a dict value.

If the value is a list or tuple, returns the first value. If it’s something else, returns the value itself. If the key doesn’t exist, returns default, which is None unless specified.

oauth_dropins.webutil.util.get_list(dict, key)[source]

Returns a value from a dict as a list.

If the value is a list or tuple, it’s converted to a list. If it’s something else, it’s returned as a single-element list. If the key doesn’t exist, returns [].
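The two lookups above can be sketched as a pair of small helpers. The names first_value and as_list are hypothetical, and returning default for an empty list is an assumption that the text does not cover.

```python
def first_value(mapping, key, default=None):
    """Hypothetical sketch of a get_first()-style lookup."""
    if key not in mapping:
        return default
    value = mapping[key]
    if isinstance(value, (list, tuple)):
        # Assumption: an empty sequence falls back to default.
        return value[0] if value else default
    return value


def as_list(mapping, key):
    """Hypothetical sketch of a get_list()-style lookup."""
    if key not in mapping:
        return []
    value = mapping[key]
    if isinstance(value, (list, tuple)):
        return list(value)
    return [value]


data = {'tags': ('a', 'b'), 'name': 'x'}
```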

oauth_dropins.webutil.util.get_required_param(handler, name)[source]
oauth_dropins.webutil.util.if_changed(cache, updates, key, value)[source]

Returns a value if it’s different from the cached value, otherwise None.

Values that evaluate to False are considered equivalent to None, in order to save cache space.

If the values differ, updates[key] is set to value. You can use this to collect changes that should be made to the cache in batch. None values in updates mean that the corresponding key should be deleted.

  • cache – any object with a get(key) method
  • updates – mapping (e.g. dict)
  • key – anything supported by cache
  • value – anything supported by cache

Returns: value or None
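The compare-and-collect pattern described above might be sketched like this. The name changed_value is hypothetical, and a plain dict stands in for the cache since it has a get(key) method.

```python
def changed_value(cache, updates, key, value):
    """Hypothetical sketch of the compare-and-collect pattern."""
    # Falsy values are treated as equivalent to None.
    value = value or None
    cached = cache.get(key) or None
    if value == cached:
        return None
    updates[key] = value  # None here means "delete this key later"
    return value


cache = {'a': 1, 'b': 2, 'c': '', 'd': 4}
updates = {}
results = [
    changed_value(cache, updates, 'a', 1),     # unchanged
    changed_value(cache, updates, 'b', 3),     # changed
    changed_value(cache, updates, 'c', None),  # '' and None are equivalent
    changed_value(cache, updates, 'd', 0),     # falsy, so marked for deletion
]
```

After the calls, updates holds only the keys that need to be written or deleted, so the cache can be updated in one batch.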

oauth_dropins.webutil.util.ignore_http_4xx_error(*args, **kwds)[source]

Extracts the status code and response from different HTTP exception types.

  • exception – one of:
      • apiclient.errors.HttpError
      • exc.WSGIHTTPException
      • gdata.client.RequestError
      • oauth2client.client.AccessTokenRefreshError
      • requests.HTTPError
      • urllib2.HTTPError
      • urllib2.URLError

Returns: (string status code or None, string response body or None)


Returns True if arg is a base64 encoded string, False otherwise.


Returns True if the given exception is a network connection failure.

…False otherwise.


Returns True if arg can be converted to a float, False otherwise.


Returns True if arg can be converted to an integer, False otherwise.

oauth_dropins.webutil.util.linkify(text, pretty=False, skip_bare_cc_tlds=False, **kwargs)[source]

Adds HTML links to URLs in the given plain text.

For example: linkify('Hello http://tornadoweb.org!') would return ‘Hello <a href="http://tornadoweb.org">http://tornadoweb.org</a>!’

Ignores URLs that are inside HTML links, i.e. anchor tags that look like <a href="…"> .

  • text – string, input
  • pretty – if True, uses pretty_link() for link text

Returns: string, linkified input


Reads lines from a file and returns them as a set.

Leading and trailing whitespace is trimmed. Blank lines and lines beginning with # (ie comments) are ignored.

Parameters:file – a file object or other iterable that returns lines

Returns: set of strings
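A minimal sketch of that filtering, assuming any iterable of text lines. The name read_line_set is hypothetical.

```python
import io


def read_line_set(lines):
    """Hypothetical sketch: strips lines, skips blanks and # comments."""
    stripped = (line.strip() for line in lines)
    return {line for line in stripped if line and not line.startswith('#')}


entries = read_line_set(io.StringIO('foo\n  bar  \n\n# a comment\nfoo\n'))
```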


Tries to convert an ISO 8601 date/time string to RFC 3339.

The formats are similar, but not identical, e.g. RFC 3339 includes a colon in the timezone offset at the end (+00:00 instead of +0000), but ISO 8601 doesn’t.

If the input can’t be parsed as ISO 8601, it’s silently returned, unchanged!
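The offset difference alone can be illustrated with a small sketch that inserts the colon RFC 3339 requires. The helper add_offset_colon is hypothetical and handles only this one difference; it does not parse or validate the date, so it is not a substitute for the real conversion.

```python
import re


def add_offset_colon(value):
    """Hypothetical helper: rewrites a trailing +HHMM/-HHMM offset as +HH:MM."""
    # Require a 'T' before the offset so that plain dates are left alone.
    return re.sub(r'(T.*[+-]\d{2})(\d{2})$', r'\1:\2', value)
```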



Tries to convert a string or int UNIX timestamp to RFC 3339.

oauth_dropins.webutil.util.parse_acct_uri(uri, hosts=None)[source]

Parses acct: URIs of the form acct:user@example.com .

Background: http://hueniverse.com/2009/08/making-the-case-for-a-new-acct-uri-scheme/

  • uri – string
  • hosts – sequence of allowed hosts (usually domains). None means allow all.

Returns: (username, host) tuple

Raises: ValueError if the uri is invalid or the host isn’t allowed.
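A rough standard-library sketch of this parsing is below. The name parse_acct and the error messages are hypothetical.

```python
from urllib.parse import urlparse


def parse_acct(uri, hosts=None):
    """Hypothetical sketch: splits acct:user@host into (user, host)."""
    parsed = urlparse(uri)
    if parsed.scheme != 'acct':
        raise ValueError('Expected acct: URI, got %r' % uri)
    user, sep, host = parsed.path.partition('@')
    if not user or not sep or not host:
        raise ValueError('Expected acct:user@host, got %r' % uri)
    # hosts=None means every host is allowed.
    if hosts is not None and host not in hosts:
        raise ValueError('Host %r is not allowed' % host)
    return user, host


parsed_acct = parse_acct('acct:user@example.com')
```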


Parses an ISO 8601 or RFC 3339 date/time string and returns a datetime.

Time zone designator is optional. If present, the returned datetime will be time zone aware.

Parameters:str – string ISO 8601 or RFC 3339, e.g. ‘2012-07-23T05:54:49+00:00’

Returns: datetime


Returns the domain and name in a tag URI string.

Inverse of tag_uri().

Returns: (string domain, string name) tuple, or None if the tag URI couldn’t be parsed

Renders a pretty, short HTML link to a URL.

If text is not provided, the link text is the URL without the leading http(s)://[www.], ellipsized at the end if necessary. URL escape characters and UTF-8 are decoded.

The default maximum length follows Twitter’s rules: full domain plus 15 characters of path (including leading slash).

  • https://dev.twitter.com/docs/tco-link-wrapper/faq
  • https://dev.twitter.com/docs/counting-characters

  • url – string
  • text – string, optional
  • keep_host – if False, remove the host from the link text
  • glyphicon – string glyphicon to render after the link text, if provided. Details: http://glyphicons.com/
  • attrs – dict of attributes => values to include in the a tag. optional
  • new_tab – boolean, include target="_blank" if True
  • max_length – int, max link text length in characters. ellipsized beyond this.

Returns: unicode string HTML snippet with <a> tag


Wraps requests.* and logs the HTTP method and URL.

oauth_dropins.webutil.util.requests_get(url, *args, **kwargs)
oauth_dropins.webutil.util.requests_head(url, *args, **kwargs)
oauth_dropins.webutil.util.requests_post(url, *args, **kwargs)
oauth_dropins.webutil.util.schemeless(url, slashes=True)[source]

Strips the scheme (e.g. ‘https:’) from a URL.

  • url – string
  • slashes – if False, also strips leading slashes and trailing slash, e.g. ‘http://example.com/’ becomes ‘example.com’

Returns: string URL
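A minimal sketch of the behavior, using the Python 3 standard library. The name strip_scheme is hypothetical, and keeping the leading // by default is an assumption based on the parameter description above.

```python
from urllib.parse import urlparse, urlunparse


def strip_scheme(url, slashes=True):
    """Hypothetical sketch: removes the scheme, optionally the slashes too."""
    url = urlunparse(urlparse(url)._replace(scheme=''))
    if not slashes:
        # Drop the leading '//' and any trailing '/'.
        url = url.strip('/')
    return url
```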

oauth_dropins.webutil.util.tag_uri(domain, name, year=None)[source]

Returns a tag URI string for the given domain and name.

Example return value: ‘tag:twitter.com,2012:snarfed_org/172417043893731329’

Background on tag URIs: http://taguri.org/
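The round trip between building and parsing a tag URI can be sketched as follows. The names make_tag_uri and parse_tag are hypothetical, and omitting the year entirely when it is None is an assumption.

```python
import re


def make_tag_uri(domain, name, year=None):
    """Hypothetical sketch: builds tag:DOMAIN[,YEAR]:NAME."""
    # Assumption: the year part is left out when no year is given.
    year_part = ',%s' % year if year is not None else ''
    return 'tag:%s%s:%s' % (domain, year_part, name)


def parse_tag(uri):
    """Hypothetical inverse: returns (domain, name), or None if no match."""
    match = re.match(r'tag:([^,:]+)(?:,\d+)?:(.+)$', uri)
    return match.groups() if match else None


uri = make_tag_uri('twitter.com', 'snarfed_org/172417043893731329', 2012)
```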


Converts a datetime to a float POSIX timestamp (seconds since epoch).


Renders a dict (usually from JSON) as an XML snippet.

Splits text into link and non-link text.

  • text – string to linkify
  • skip_bare_cc_tlds – boolean, whether to skip links of the form [domain].[2-letter TLD] with no scheme and no path

Returns: a tuple containing two lists of strings, a list of links and list of non-link text. Roughly equivalent to the output of re.findall and re.split, with some post-processing.


Recursively removes dict and list elements with None or empty values.
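A recursive sketch of this kind of pruning is below. The names trim and is_empty are hypothetical, and the sketch assumes that 0 and False count as real values and are kept.

```python
def is_empty(value):
    # Assumption: only None and empty strings/containers count as empty.
    return value is None or (
        isinstance(value, (str, list, tuple, dict)) and len(value) == 0)


def trim(value):
    """Hypothetical sketch: recursively drops None and empty elements."""
    if isinstance(value, dict):
        trimmed = {k: trim(v) for k, v in value.items()}
        return {k: v for k, v in trimmed.items() if not is_empty(v)}
    if isinstance(value, (list, tuple)):
        trimmed = [trim(v) for v in value]
        return type(value)(v for v in trimmed if not is_empty(v))
    return value


pruned = trim({'a': None, 'b': {'c': [], 'd': ''}, 'e': [1, None, {}], 'f': 0})
```

Children are trimmed before the emptiness check, so a dict that only contained empty values is removed as well.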


Returns a list with duplicate items removed.

Like list(set(…)), but preserves order.
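In Python 3.7 and later, one common way to get this behavior is dict.fromkeys(), since dicts preserve insertion order. The name unique_in_order is hypothetical, and the sketch assumes the items are hashable.

```python
def unique_in_order(items):
    """Hypothetical sketch: de-dupes while keeping first-seen order."""
    # dict keys are unique and keep insertion order (Python 3.7+).
    return list(dict.fromkeys(items))
```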

oauth_dropins.webutil.util.update_scheme(url, handler)[source]

Returns a modified string url with the current request’s scheme.

Useful for converting URLs to https if and only if the current request itself is being served over https.

oauth_dropins.webutil.util.urlopen(url_or_req, *args, **kwargs)[source]

Wraps urllib2.urlopen and logs the HTTP method and URL.


Request handler utility classes.

Includes classes for serving templates with common variables and XRD[S] and JRD files like host-meta and friends.

class oauth_dropins.webutil.handlers.HostMetaHandler(*args, **kwargs)[source]

Bases: oauth_dropins.webutil.handlers.XrdOrJrdHandler

Renders and serves the /.well-known/host-meta file.

class oauth_dropins.webutil.handlers.HostMetaXrdsHandler(*args, **kwargs)[source]

Bases: oauth_dropins.webutil.handlers.TemplateHandler

Renders and serves the /.well-known/host-meta.xrds XRDS-Simple file.


Returns the string content type.


Returns the string template file path.

class oauth_dropins.webutil.handlers.ModernHandler(*args, **kwargs)[source]

Bases: webapp2.RequestHandler

Base handler that adds modern open/secure headers like CORS, HSTS, etc.

class oauth_dropins.webutil.handlers.TemplateHandler(*args, **kwargs)[source]

Bases: oauth_dropins.webutil.handlers.ModernHandler

Renders and serves a template based on class attributes.

Subclasses must override template_file() and may also override template_vars() and content_type().


Returns the string content type.


Returns dict of HTTP response headers. Subclasses may override.

To advertise XRDS, use:

headers['X-XRDS-Location'] = 'https://%s/.well-known/host-meta.xrds' % appengine_config.HOST

Returns the string template file path.


Returns a dict of template variable string keys and values.

class oauth_dropins.webutil.handlers.XrdOrJrdHandler(*args, **kwargs)[source]

Bases: oauth_dropins.webutil.handlers.TemplateHandler

Renders and serves an XRD or JRD file.

JRD is served if the request path ends in .json, or the query parameters include ‘format=json’, or the request headers include ‘Accept: application/json’.
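The three conditions can be sketched as a single predicate. The function wants_jrd and its plain-dict arguments are hypothetical stand-ins for the request object, and matching the Accept header exactly is a simplifying assumption.

```python
def wants_jrd(path, query_params, headers):
    """Hypothetical sketch: True if JRD should be served, False for XRD."""
    return (path.endswith('.json')
            or query_params.get('format') == 'json'
            # Simplification: real Accept headers may list several types.
            or headers.get('Accept') == 'application/json')
```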

Subclasses must override template_prefix().


Returns the string content type.


Returns True if JRD should be served, False if XRD.


Returns the string template file path.

oauth_dropins.webutil.handlers.handle_exception(self, e, debug)[source]

A webapp2 exception handler that propagates HTTP exceptions into the response.

Use this as a webapp2.RequestHandler handle_exception() method by adding this line to your handler class definition:

handle_exception = handlers.handle_exception

I originally tried to put this in a RequestHandler subclass, but it gave me this exception:

File ".../webapp2-2.5.1/webapp2_extras/local.py", line 136, in _get_current_object
  raise RuntimeError('no object bound to %s' % self.__name__)
RuntimeError: no object bound to app

These are probably related:

  • http://eemyop.blogspot.com/2013/05/digging-around-in-webapp2-finding-out.html
  • http://code.google.com/p/webapp-improved/source/detail?r=d962ac4625ce3c43a3e59fd7fc07daf8d7b7c46a

oauth_dropins.webutil.handlers.redirect(from_domain, to_domain)[source]

Decorator for RequestHandler methods that 301 redirects to a new domain.

Preserves scheme, path, and query.

  • from_domain – string or sequence of strings
  • to_domain – string


App Engine datastore model base classes and utilities.

class oauth_dropins.webutil.models.KeyNameModel(*args, **kwargs)[source]

Bases: google.appengine.ext.db.Model

A db model class that requires a key name.

class oauth_dropins.webutil.models.SingleEGModel(*args, **kwargs)[source]

Bases: google.appengine.ext.db.Model

A model class that stores all entities in a single entity group.

All entities use the same parent key (below), and all() automatically adds it as an ancestor. That allows, among other things, fetching all entities of this kind with strong consistency.

classmethod all()[source]

Returns a query over all instances of this model from the datastore.

Returns:Query that will retrieve all instances from entity collection.

Sets the parent keyword arg. If it’s already set, checks that it’s correct.

classmethod get_by_id(*args, **kwargs)[source]

Get instance of Model class by id.

  • key_names – A single id or a list of ids.
  • parent – Parent of instances to get. Can be a model or key.
  • config – datastore_rpc.Configuration to use for this request.
classmethod get_by_key_name(*args, **kwargs)[source]

Get instance of Model class by its key’s name.

  • key_names – A single key-name or a list of key-names.
  • parent – Parent of instances to get. Can be a model or key.
  • config – datastore_rpc.Configuration to use for this request.
classmethod get_or_insert(*args, **kwargs)[source]

Transactionally retrieve or create an instance of Model class.

This acts much like the Python dictionary setdefault() method, where we first try to retrieve a Model instance with the given key name and parent. If it’s not present, then we create a new instance (using the **kwds supplied) and insert that with the supplied key name.

Subsequent calls to this method with the same key_name and parent will always yield the same entity (though not the same actual object instance), regardless of the **kwds supplied. If the specified entity has somehow been deleted separately, then the next call will create a new entity and return it.

If the ‘parent’ keyword argument is supplied, it must be a Model instance. It will be used as the parent of the new instance of this Model class if one is created.

This method is especially useful for having just one unique entity for a specific identifier. Insertion/retrieval is done transactionally, which guarantees uniqueness.

Example usage:

from google.appengine.ext import db

class WikiTopic(db.Model):
    creation_date = db.DateTimeProperty(auto_now_add=True)
    body = db.TextProperty(required=True)

# The first time through we'll create the new topic.
wiki_word = 'CommonIdioms'
topic = WikiTopic.get_or_insert(wiki_word,
                                body='This topic is totally new!')
assert topic.key().name() == 'CommonIdioms'
assert topic.body == 'This topic is totally new!'

# The second time through will just retrieve the existing entity,
# so the new body is discarded.
overwrite_topic = WikiTopic.get_or_insert(wiki_word,
                                          body='A totally different message!')
assert overwrite_topic.key().name() == 'CommonIdioms'
assert overwrite_topic.body == 'This topic is totally new!'

  • key_name – Key name to retrieve or create.
  • **kwds – Keyword arguments to pass to the constructor of the model class if an instance for the specified key name does not already exist. If an instance with the supplied key_name and parent already exists, the rest of these arguments will be discarded.

Returns: Existing instance of Model class with the specified key_name and parent or a new one that has just been created.

Raises: TransactionFailedError if the specified Model instance could not be retrieved or created transactionally (due to high contention, etc).
classmethod shared_parent_key()[source]

Returns the shared parent key for this class.

It’s not actually an entity, just a placeholder key.

class oauth_dropins.webutil.models.StringIdModel(**kwds)[source]

Bases: google.appengine.ext.ndb.model.Model

An ndb model class that requires a string id.

put(*args, **kwargs)[source]

Raises AssertionError if string id is not provided.


Unit test utilities.

class oauth_dropins.webutil.testutil.HandlerTest(methodName='runTest')[source]

Bases: oauth_dropins.webutil.testutil.TestCase

Base test class for webapp2 request handlers.

Uses App Engine’s testbed to set up API stubs: http://code.google.com/appengine/docs/python/tools/localunittesting.html






Hook method for setting up the test fixture before exercising it.


Hook method for deconstructing the test fixture after testing it.

class oauth_dropins.webutil.testutil.TestCase(methodName='runTest')[source]

Bases: mox.MoxTestBase

Test case class with lots of extra helpers.

assert_entities_equal(a, b, ignore=frozenset([]), keys_only=False, in_order=False)[source]

Asserts that a and b are equivalent entities or lists of entities.

…specifically, that they have the same property values, and if they both have populated keys, that their keys are equal too.

  • a, b – db.Model or ndb.Model instances or lists of instances
  • ignore – sequence of strings, property names not to compare
  • keys_only – boolean, if True only compare keys
  • in_order – boolean. If False, all entities must have keys.
assert_equals(expected, actual, msg=None, in_order=False)[source]

Pinpoints individual element differences in lists and dicts.

If in_order is False, ignores order in lists and tuples.

assert_multiline_equals(expected, actual)[source]

Compares two multi-line strings and reports a diff style output.

Ignores leading and trailing whitespace on each line, and squeezes repeated blank lines down to just one.
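The whitespace normalization described here could be sketched as below, applied to both strings before they are compared. The name normalize_lines is hypothetical, and stripping the text as a whole is an assumption.

```python
import re


def normalize_lines(text):
    """Hypothetical sketch: strips each line and squeezes blank-line runs."""
    lines = [line.strip() for line in text.strip().splitlines()]
    # Collapse runs of blank lines down to a single blank line.
    return re.sub(r'\n{3,}', '\n\n', '\n'.join(lines))
```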

assert_multiline_in(expected, actual)[source]

Checks that a multi-line string is in another and reports a diff output.

Ignores leading and trailing whitespace on each line, and squeezes repeated blank lines down to just one.


Returns a list of keys for a list of entities.

expect_requests_get(*args, **kwargs)[source]
expect_requests_head(*args, **kwargs)[source]
expect_requests_post(*args, **kwargs)[source]
expect_urlopen(url, response=None, status=200, data=None, headers=None, response_headers={}, **kwargs)[source]

Stubs out urllib2.urlopen() and sets up an expected call.

If status isn’t 2xx, makes the expected call raise a urllib2.HTTPError instead of returning the response.

If data is set, url must be a urllib2.Request.

If response is unset, returns the expected call.

  • url – string, re.RegexObject or urllib2.Request or webob.Request
  • response – string
  • status – int, HTTP response code
  • data – optional string POST body
  • headers – optional expected request header dict
  • response_headers – optional response header dict
  • kwargs – other keyword args, e.g. timeout

Hook method for setting up the test fixture before exercising it.


Automatically return 200 to outgoing HEAD requests.


Mock outgoing HEAD requests so they must be expected individually.

class oauth_dropins.webutil.testutil.UrlopenResult(status_code, content, url=None, headers={})[source]

Bases: object

A fake urllib2.urlopen() or urlfetch.fetch() result object.


Returns a task’s ETA as a datetime.


Parses a task’s POST body and returns the query params in a dict.