oauth_dropins.webutil

Reference documentation.

util

Misc utilities.

class oauth_dropins.webutil.util.Struct(**kwargs)[source]

Bases: object

A generic class that initializes its attributes from constructor kwargs.

__weakref__

list of weak references to the object (if defined)

class oauth_dropins.webutil.util.CacheDict[source]

Bases: dict

A dict that also implements memcache’s get_multi() and set_multi() methods.

Useful as a simple in-memory replacement for App Engine’s memcache API, e.g. for get_activities_response() in snarfed/activitystreams-unofficial.

__weakref__

list of weak references to the object (if defined)
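As a rough illustration of the idea, here is a minimal sketch of a dict subclass with the two batch methods. The class name CacheDictSketch is hypothetical, and the choice to return only the keys that are present from get_multi() is an assumption modeled on memcache, not taken from the library source.

```python
class CacheDictSketch(dict):
    """Hypothetical stand-in for a dict with memcache-style batch methods."""

    def get_multi(self, keys):
        # Assumption: like memcache, only keys that are present are returned.
        return {key: self[key] for key in keys if key in self}

    def set_multi(self, mapping):
        # Stores every key/value pair from the mapping.
        self.update(mapping)


cache = CacheDictSketch()
cache.set_multi({'a': 1, 'b': 2})
found = cache.get_multi(['a', 'c'])
```

Because it is still a plain dict, the usual get() and item access keep working alongside the batch methods.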

oauth_dropins.webutil.util.to_xml(value)[source]

Renders a dict (usually from JSON) as an XML snippet.

oauth_dropins.webutil.util.trim_nulls(value)[source]

Recursively removes dict and list elements with None or empty values.
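A minimal sketch of this kind of recursive trimming follows. The name trim_nulls_sketch is hypothetical, and the exact set of values treated as empty (None, empty containers, and the empty string, but not 0 or False) is an assumption rather than the library's documented rule.

```python
def trim_nulls_sketch(value):
    """Recursively drops None and empty values from dicts, lists, and tuples."""
    # Assumption: these are the values that count as "empty".
    nulls = (None, {}, [], (), '')
    if isinstance(value, dict):
        trimmed = {k: trim_nulls_sketch(v) for k, v in value.items()}
        return {k: v for k, v in trimmed.items() if v not in nulls}
    if isinstance(value, (list, tuple)):
        trimmed = [trim_nulls_sketch(v) for v in value]
        return type(value)(v for v in trimmed if v not in nulls)
    return value


result = trim_nulls_sketch({'a': None, 'b': {'c': []}, 'd': [1, None, ''], 'e': 0})
# result == {'d': [1], 'e': 0}
```

Trimming children before testing the parent is what lets a dict that contains only empty values disappear as well.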

oauth_dropins.webutil.util.uniquify(input)[source]

Returns a list with duplicate items removed.

Like list(set(...)), but preserves order.
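The order-preserving behavior can be sketched as below. The name uniquify_sketch is hypothetical, and the sketch assumes the items are hashable.

```python
def uniquify_sketch(items):
    """Returns items with duplicates removed, keeping first occurrences in order."""
    seen = set()
    result = []
    for item in items:
        if item not in seen:
            seen.add(item)
            result.append(item)
    return result


unique = uniquify_sketch([3, 1, 3, 2, 1])
# unique == [3, 1, 2]
```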

oauth_dropins.webutil.util.get_list(dict, key)[source]

Returns a value from a dict as a list.

If the value is a list or tuple, it’s converted to a list. If it’s something else, it’s returned as a single-element list. If the key doesn’t exist, returns [].

oauth_dropins.webutil.util.get_first(dict, key, default=None)[source]

Returns the first element of a dict value.

If the value is a list or tuple, returns the first value. If it’s something else, returns the value itself. If the key doesn’t exist, returns default (None unless specified).
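The rules for get_list() and get_first() can be sketched together. The function names below are hypothetical, and returning the default for an empty list is an assumption, since the text above does not cover that case.

```python
def get_list_sketch(obj, key):
    """Returns obj[key] as a list, or [] if the key is missing."""
    if key not in obj:
        return []
    value = obj[key]
    if isinstance(value, (list, tuple)):
        return list(value)
    return [value]


def get_first_sketch(obj, key, default=None):
    """Returns the first element of obj[key], or default if the key is missing."""
    values = get_list_sketch(obj, key)
    # Assumption: an empty list or tuple also falls back to the default.
    return values[0] if values else default


data = {'tags': ('a', 'b'), 'title': 'hello'}
tags = get_list_sketch(data, 'tags')
title = get_first_sketch(data, 'title')
```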

oauth_dropins.webutil.util.tag_uri(domain, name, year=None)[source]

Returns a tag URI string for the given domain and name.

Example return value: ‘tag:twitter.com,2012:snarfed_org/172417043893731329’

Background on tag URIs: http://taguri.org/

oauth_dropins.webutil.util.parse_tag_uri(uri)[source]

Returns the domain and name in a tag URI string.

Inverse of tag_uri().

Returns:(string domain, string name) tuple, or None if the tag URI couldn’t be parsed
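To make the round trip concrete, here is a minimal sketch of building and parsing tag URIs of the shape shown in the example above. The function names and the regular expression are hypothetical, and omitting the date part when year is None is an assumption.

```python
import re

# Assumption: the optional date part is digits and dashes after a comma.
TAG_URI_RE = re.compile(r'^tag:([^,:]+)(?:,[\d-]+)?:(.+)$')


def tag_uri_sketch(domain, name, year=None):
    """Builds a tag URI string such as tag:twitter.com,2012:some_name."""
    date = '' if year is None else ',%s' % year
    return 'tag:%s%s:%s' % (domain, date, name)


def parse_tag_uri_sketch(uri):
    """Returns a (domain, name) tuple, or None if uri doesn't look like a tag URI."""
    match = TAG_URI_RE.match(uri)
    return match.groups() if match else None


uri = tag_uri_sketch('twitter.com', 'snarfed_org/172417043893731329', 2012)
parsed = parse_tag_uri_sketch(uri)
```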
oauth_dropins.webutil.util.parse_acct_uri(uri, hosts=None)[source]

Parses acct: URIs of the form acct:user@example.com.

Background: http://hueniverse.com/2009/08/making-the-case-for-a-new-acct-uri-scheme/

Parameters:
  • uri – string
  • hosts – sequence of allowed hosts (usually domains). None means allow all.
Returns:

(username, host) tuple

Raises: ValueError if the uri is invalid or the host isn’t allowed.
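A minimal sketch of the parsing and host check described above follows. The name parse_acct_uri_sketch is hypothetical, and the exact validation rules (non-empty user and host, split on the last @) are assumptions.

```python
def parse_acct_uri_sketch(uri, hosts=None):
    """Returns a (username, host) tuple for an acct: URI, or raises ValueError."""
    scheme, _, rest = uri.partition(':')
    if scheme != 'acct':
        raise ValueError('Not an acct: URI: %r' % uri)
    # Assumption: the host is everything after the last @.
    user, sep, host = rest.rpartition('@')
    if not sep or not user or not host:
        raise ValueError('Expected acct:user@host, got %r' % uri)
    if hosts is not None and host not in hosts:
        raise ValueError('Host %r is not allowed' % host)
    return user, host


account = parse_acct_uri_sketch('acct:user@example.com', hosts=['example.com'])
```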

Extracts and returns the meaningful domain from a URL.

Strips www., mobile., and m. from the beginning of the domain.

Parameters:url – string
Returns:string
oauth_dropins.webutil.util.domain_or_parent_in(input, domains)[source]

Returns True if an input domain or its parent is in a set of domains.

Examples:

  • foo, [] => False
  • foo, [foo] => True
  • foo.bar.com, [bar.com] => True
  • foo.bar.com, [.bar.com] => True
  • foo.bar.com, [fux.bar.com] => False
  • bar.com, [fux.bar.com] => False
Parameters:
  • input – string domain
  • domains – sequence of string domains
Returns:

boolean
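The examples above can be reproduced with a short sketch that checks the domain and each of its parent suffixes. The name domain_or_parent_in_sketch is hypothetical, and stripping a leading dot from the allowed domains is an assumption inferred from the .bar.com example.

```python
def domain_or_parent_in_sketch(domain, domains):
    """Returns True if domain, or any parent of it, is in domains."""
    # Assumption: '.bar.com' and 'bar.com' are treated the same.
    allowed = {d.lstrip('.') for d in domains}
    parts = domain.split('.')
    # Candidates for foo.bar.com are foo.bar.com, bar.com, and com.
    return any('.'.join(parts[i:]) in allowed for i in range(len(parts)))


matches = domain_or_parent_in_sketch('foo.bar.com', ['bar.com'])
```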

oauth_dropins.webutil.util.update_scheme(url, handler)[source]

Returns a modified string url with the current request’s scheme.

Useful for converting URLs to https if and only if the current request itself is being served over https.

oauth_dropins.webutil.util.schemeless(url, slashes=True)[source]

Strips the scheme (e.g. ‘https:’) from a URL.

Parameters:
  • url – string
  • slashes – if False, also strips leading slashes and trailing slash, e.g. ‘http://example.com/’ becomes ‘example.com’
Returns:

string URL
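As a rough sketch of the behavior described here, using the slashes flag from the signature: the function name and the scheme regular expression are hypothetical, not the library's code.

```python
import re

# Matches a leading URL scheme such as 'http:' or 'https:'.
SCHEME_RE = re.compile(r'^[A-Za-z][A-Za-z0-9+.-]*:')


def schemeless_sketch(url, slashes=True):
    """Strips the scheme, and optionally the surrounding slashes, from a URL."""
    stripped = SCHEME_RE.sub('', url)
    if not slashes:
        stripped = stripped.strip('/')
    return stripped


with_slashes = schemeless_sketch('https://example.com/')
bare = schemeless_sketch('http://example.com/', slashes=False)
```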

oauth_dropins.webutil.util.fragmentless(url)[source]

Strips the fragment (e.g. ‘#foo’) from a URL.

Parameters:url – string
Returns:string URL
oauth_dropins.webutil.util.clean_url(url)[source]

Removes transient query params (e.g. utm_*) from a URL.

The utm_* (Urchin Tracking Module) params come from Google Analytics. https://support.google.com/analytics/answer/1033867

The source=rss-... params are on all links in Medium’s RSS feeds.

Parameters:url – string
Returns:string, the cleaned url, or None if it can’t be parsed
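A minimal sketch of this kind of cleanup, written against Python 3's urllib.parse rather than the Python 2 modules the library targets. The name clean_url_sketch is hypothetical, and the exact matching rules for the two parameter families are assumptions based on the description above.

```python
from urllib.parse import parse_qsl, urlencode, urlparse, urlunparse


def clean_url_sketch(url):
    """Removes utm_* and source=rss-... query params; None if url can't be parsed."""
    try:
        parts = urlparse(url)
        query = parse_qsl(parts.query, keep_blank_values=True)
    except ValueError:
        return None
    kept = [(key, value) for key, value in query
            if not key.startswith('utm_')
            and not (key == 'source' and value.startswith('rss-'))]
    return urlunparse(parts._replace(query=urlencode(kept)))


cleaned = clean_url_sketch('http://example.com/post?utm_source=x&id=5&source=rss-abc')
# cleaned == 'http://example.com/post?id=5'
```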
oauth_dropins.webutil.util.base_url(url)[source]

Returns the base of a given URL.

For example, returns ‘http://site/posts/’ for ‘http://site/posts/123’.

Parameters:url – string

Returns a list of unique string URLs in the given text.

URLs in the returned list are in the order they first appear in the text.

Splits text into link and non-link text.

Parameters:
  • text – string to linkify
  • skip_bare_cc_tlds – boolean, whether to skip links of the form [domain].[2-letter TLD] with no scheme and no path
Returns:

a tuple containing two lists of strings, a list of links and list of non-link text. Roughly equivalent to the output of re.findall and re.split, with some post-processing.

oauth_dropins.webutil.util.linkify(text, pretty=False, skip_bare_cc_tlds=False, **kwargs)[source]

Adds HTML links to URLs in the given plain text.

For example: linkify('Hello http://tornadoweb.org!') would return ‘Hello <a href="http://tornadoweb.org">http://tornadoweb.org</a>!’

Ignores URLs that are inside HTML links, i.e. anchor tags that look like <a href="...">.

Parameters:
  • text – string, input
  • pretty – if True, uses pretty_link() for link text
Returns:

string, linkified input

Renders a pretty, short HTML link to a URL.

If text is not provided, the link text is the URL without the leading http(s)://[www.], ellipsized at the end if necessary. URL escape characters and UTF-8 are decoded.

The default maximum length follows Twitter’s rules: full domain plus 15 characters of path (including leading slash).

Parameters:
  • url – string
  • text – string, optional
  • keep_host – if False, remove the host from the link text
  • glyphicon – string glyphicon to render after the link text, if provided. Details: http://glyphicons.com/
  • attrs – dict of attributes => values to include in the a tag. optional
  • new_tab – boolean, include target="_blank" if True
  • max_length – int, max link text length in characters. ellipsized beyond this.
Returns:

unicode string HTML snippet with <a> tag

class oauth_dropins.webutil.util.SimpleTzinfo[source]

Bases: datetime.tzinfo

A simple, DST-unaware tzinfo subclass.

__weakref__

list of weak references to the object (if defined)

oauth_dropins.webutil.util.parse_iso8601(str)[source]

Parses an ISO 8601 or RFC 3339 date/time string and returns a datetime.

Time zone designator is optional. If present, the returned datetime will be time zone aware.

Parameters:str – string ISO 8601 or RFC 3339, e.g. ‘2012-07-23T05:54:49+00:00’
Returns:datetime
oauth_dropins.webutil.util.maybe_iso8601_to_rfc3339(input)[source]

Tries to convert an ISO 8601 date/time string to RFC 3339.

The formats are similar, but not identical, e.g. RFC 3339 includes a colon in the timezone offset at the end (+00:00 instead of +0000), but ISO 8601 doesn’t require it.

If the input can’t be parsed as ISO 8601, it’s silently returned, unchanged!

http://www.rfc-editor.org/rfc/rfc3339.txt
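The "return the input unchanged on failure" contract can be sketched as follows. The function name is hypothetical, and the sketch only recognizes one input layout (seconds precision with a numeric offset), which is a deliberate simplification rather than the library's parsing logic.

```python
from datetime import datetime


def maybe_iso8601_to_rfc3339_sketch(value):
    """Returns value with an RFC 3339 style offset, or value unchanged if unparseable."""
    try:
        # Assumption: only this one layout is handled in the sketch.
        parsed = datetime.strptime(value, '%Y-%m-%dT%H:%M:%S%z')
    except (TypeError, ValueError):
        return value
    # isoformat() writes the offset with a colon, e.g. +00:00.
    return parsed.isoformat()


converted = maybe_iso8601_to_rfc3339_sketch('2012-07-23T05:54:49+0000')
# converted == '2012-07-23T05:54:49+00:00'
```

Since failures are silent, callers that need a guaranteed RFC 3339 value would have to validate the result themselves.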

oauth_dropins.webutil.util.maybe_timestamp_to_rfc3339(input)[source]

Tries to convert a string or int UNIX timestamp to RFC 3339.

oauth_dropins.webutil.util.to_utc_timestamp(input)[source]

Converts a datetime to a float POSIX timestamp (seconds since epoch).

oauth_dropins.webutil.util.as_utc(input)[source]

Converts a timezone-aware datetime to a naive UTC datetime.

If input is timezone-naive, it’s returned as is.

Doesn’t support DST!
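One way to picture this conversion is the sketch below, which subtracts the datetime's own UTC offset and then drops the tzinfo. The name as_utc_sketch is hypothetical, and this is an illustration of the described behavior rather than the library's code.

```python
from datetime import datetime, timedelta, timezone


def as_utc_sketch(dt):
    """Converts an aware datetime to naive UTC; naive input is returned as is."""
    offset = dt.utcoffset()
    if offset is None:
        return dt
    return (dt - offset).replace(tzinfo=None)


pacific = timezone(timedelta(hours=-7))
aware = datetime(2012, 7, 23, 5, 54, 49, tzinfo=pacific)
naive_utc = as_utc_sketch(aware)
# naive_utc == datetime(2012, 7, 23, 12, 54, 49)
```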

oauth_dropins.webutil.util.ellipsize(str, words=14, chars=140)[source]

Truncates and ellipsizes str if it’s longer than words or chars.

Words are simply tokenized on whitespace, nothing smart.
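A rough sketch of whitespace-based truncation with both limits is shown below. The name ellipsize_sketch, the three-dot ellipsis, and the choice to keep the result within chars including the ellipsis are all assumptions.

```python
def ellipsize_sketch(text, words=14, chars=140):
    """Truncates text to at most `words` words and `chars` characters."""
    tokens = text.split()
    truncated = len(tokens) > words
    if truncated:
        text = ' '.join(tokens[:words])
    # Assumption: the ellipsis counts toward the character limit.
    if len(text) > chars or (truncated and len(text) + 3 > chars):
        text = text[:max(chars - 3, 0)]
        truncated = True
    return text + '...' if truncated else text


short = ellipsize_sketch('a b c d', words=2)
# short == 'a b...'
```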

oauth_dropins.webutil.util.add_query_params(url, params)[source]

Adds new query parameters to a URL. Encodes as UTF-8 and URL-safe.

Parameters:
  • url – string URL or urllib2.Request. May already have query parameters.
  • params – dict or list of (string key, string value) tuples. Keys may repeat.
Returns:

string URL
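For string URLs, the behavior can be approximated with Python 3's urllib.parse as below. The name add_query_params_sketch is hypothetical, and the sketch does not handle request objects, only plain strings.

```python
from urllib.parse import parse_qsl, urlencode, urlparse, urlunparse


def add_query_params_sketch(url, params):
    """Appends params (dict or list of pairs) to the query string of url."""
    if isinstance(params, dict):
        params = list(params.items())
    parts = urlparse(url)
    # Keep existing params first, then append the new ones; keys may repeat.
    query = parse_qsl(parts.query, keep_blank_values=True) + list(params)
    return urlunparse(parts._replace(query=urlencode(query)))


extended = add_query_params_sketch('http://example.com/?a=1', [('b', 'x y'), ('b', 'é')])
# extended == 'http://example.com/?a=1&b=x+y&b=%C3%A9'
```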

oauth_dropins.webutil.util.dedupe_urls(urls)[source]

Normalizes and de-dupes http(s) URLs.

Converts domain to lower case, adds trailing slash when path is empty, and ignores scheme (http vs https), preferring https. Preserves order.

Domains are case insensitive, even modern domains with Unicode/punycode characters:

http://unicode.org/faq/idn.html#6 https://tools.ietf.org/html/rfc4343#section-5

As examples, http://foo/ and https://FOO are considered duplicates, but http://foo/bar and http://foo/bar/ aren’t.

Background: https://en.wikipedia.org/wiki/URL_normalization

TODO: port to https://pypi.python.org/pypi/urlnorm

Parameters:urls – sequence of string URLs
Returns:sequence of string URLs
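The normalization rules above can be sketched with Python 3's urllib.parse. The name dedupe_urls_sketch is hypothetical, and replacing an earlier http entry in place when an https duplicate shows up later is one assumed reading of "preferring https".

```python
from urllib.parse import urlparse, urlunparse


def dedupe_urls_sketch(urls):
    """Normalizes URLs and drops duplicates that differ only by scheme or case."""
    result = []
    position = {}  # normalized parts without the scheme -> index in result
    for url in urls:
        parts = urlparse(url)
        parts = parts._replace(netloc=parts.netloc.lower(), path=parts.path or '/')
        key = tuple(parts)[1:]
        if key not in position:
            position[key] = len(result)
            result.append(urlunparse(parts))
        elif parts.scheme == 'https':
            # Assumption: an https duplicate replaces the earlier http entry.
            result[position[key]] = urlunparse(parts)
    return result


deduped = dedupe_urls_sketch(['http://foo/', 'https://FOO', 'http://foo/bar', 'http://foo/bar/'])
# deduped == ['https://foo/', 'http://foo/bar', 'http://foo/bar/']
```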
oauth_dropins.webutil.util.if_changed(cache, updates, key, value)[source]

Returns a value if it’s different from the cached value, otherwise None.

Values that evaluate to False are considered equivalent to None, in order to save cache space.

If the values differ, updates[key] is set to value. You can use this to collect changes that should be made to the cache in batch. None values in updates mean that the corresponding key should be deleted.

Parameters:
  • cache – any object with a get(key) method
  • updates – mapping (e.g. dict)
  • key – anything supported by cache
  • value – anything supported by cache
Returns:

value or None
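The batching pattern described above can be sketched as follows. The name if_changed_sketch is hypothetical, and the code is one reading of the description, not the library source.

```python
def if_changed_sketch(cache, updates, key, value):
    """Returns value if it differs from cache's value for key, otherwise None."""
    # Falsy values are treated as equivalent to None.
    value = value or None
    cached = cache.get(key) or None
    if value == cached:
        return None
    # Record the pending change; None means the key should be deleted.
    updates[key] = value
    return value


cache = {'a': 1}
updates = {}
unchanged = if_changed_sketch(cache, updates, 'a', 1)
changed = if_changed_sketch(cache, updates, 'a', 2)
```

After a series of calls, updates holds only the keys whose values actually changed, so the cache can be written once in batch.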

oauth_dropins.webutil.util.generate_secret()[source]

Generates a URL-safe random secret string.

Uses App Engine’s os.urandom(), which is designed to be cryptographically secure: http://code.google.com/p/googleappengine/issues/detail?id=1055

Parameters:bytes – integer, length of string to generate
Returns:random string
oauth_dropins.webutil.util.is_int(arg)[source]

Returns True if arg can be converted to an integer, False otherwise.

oauth_dropins.webutil.util.is_float(arg)[source]

Returns True if arg can be converted to a float, False otherwise.

oauth_dropins.webutil.util.is_base64(arg)[source]

Returns True if arg is a base64 encoded string, False otherwise.

oauth_dropins.webutil.util.interpret_http_exception(exception)[source]

Extracts the status code and response from different HTTP exception types.

Parameters:exception

an HTTP request exception. Supported types:

Returns:(string status code or None, string response body or None)
oauth_dropins.webutil.util.is_connection_failure(exception)[source]

Returns True if the given exception is a network connection failure.

...False otherwise.

class oauth_dropins.webutil.util.FileLimiter(file_obj, read_limit)[source]

Bases: object

A file object wrapper that reads up to a limit and then reports EOF.

From http://stackoverflow.com/a/29838711/186123 . Thanks SO!

__weakref__

list of weak references to the object (if defined)
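A minimal sketch of a read-limiting wrapper is shown below. The class name FileLimiterSketch and its internals are hypothetical. It only covers read(), and it is not the code from the linked answer.

```python
import io


class FileLimiterSketch(object):
    """Wraps a file object and reports EOF after read_limit bytes or characters."""

    def __init__(self, file_obj, read_limit):
        self.file_obj = file_obj
        self.remaining = read_limit

    def read(self, size=-1):
        if self.remaining <= 0:
            # read(0) returns an empty value of the right type ('' or b'').
            return self.file_obj.read(0)
        if size is None or size < 0 or size > self.remaining:
            size = self.remaining
        data = self.file_obj.read(size)
        self.remaining -= len(data)
        return data


limited = FileLimiterSketch(io.BytesIO(b'0123456789'), 4)
first = limited.read(3)
second = limited.read()
third = limited.read()
```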

oauth_dropins.webutil.util.load_file_lines(file)[source]

Reads lines from a file and returns them as a set.

Leading and trailing whitespace is trimmed. Blank lines and lines beginning with # (ie comments) are ignored.

Parameters:file – a file object or other iterable that returns lines
Returns:set of strings
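The filtering rules can be sketched in a few lines. The name load_file_lines_sketch is hypothetical, and the sketch checks for a comment marker after stripping whitespace, which is an assumption about the order of operations.

```python
import io


def load_file_lines_sketch(lines):
    """Returns the set of stripped, non-blank, non-comment lines."""
    items = set()
    for line in lines:
        line = line.strip()
        if line and not line.startswith('#'):
            items.add(line)
    return items


blocklist = load_file_lines_sketch(io.StringIO('# comment\nfoo.com\n\n  bar.com  \nfoo.com\n'))
```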
oauth_dropins.webutil.util.urlopen(url_or_req, *args, **kwargs)[source]

Wraps urllib2.urlopen and logs the HTTP method and URL.

oauth_dropins.webutil.util.requests_fn(fn)[source]

Wraps requests.* and logs the HTTP method and URL.

oauth_dropins.webutil.util.follow_redirects(url, cache=None, fail_cache_time_secs=86400, **kwargs)[source]

Fetches a URL with HEAD, repeating if necessary to follow redirects.

Does not raise an exception if any of the HTTP requests fail, just returns the failed response. If you care, be sure to check the returned response’s status code!

Parameters:
  • url – string
  • cache – optional, a cache object to read and write resolved URLs to. Must have get(key) and set(key, value, time=...) methods. Stores ‘R [original URL]’ in key, final URL in value.
  • kwargs – passed to requests.head()
Returns:

the requests.Response for the final request. The url attribute has the final URL.

class oauth_dropins.webutil.util.UrlCanonicalizer(scheme='https', domain=None, subdomain=None, approve=None, reject=None, query=False, fragment=False, trailing_slash=False, redirects=True, headers=None)[source]

Bases: object

Converts URLs to their canonical form.

If an input URL matches approve or reject, it’s automatically approved or rejected, respectively, without following redirects.

If we HEAD the URL to follow redirects and it returns 4xx or 5xx, we return None.

__init__(scheme='https', domain=None, subdomain=None, approve=None, reject=None, query=False, fragment=False, trailing_slash=False, redirects=True, headers=None)[source]

Constructor.

Parameters:
  • scheme – string canonical scheme for this source (default ‘https’)
  • domain – string canonical domain for this source (default None). If set, links on other domains will be rejected without following redirects.
  • subdomain – string canonical subdomain, e.g. ‘www’ (default none, ie root domain). only added if there’s not already a subdomain.
  • approve – string regexp matching URLs that are automatically considered canonical
  • reject – string regexp matching URLs that are automatically rejected, i.e. never considered canonical
  • query – boolean, whether to keep query params, if any (default False)
  • fragment – boolean, whether to keep fragment, if any (default False)
  • trailing_slash – boolean, whether the path should end in / (default False)
  • redirects – boolean, whether to make HTTP HEAD requests to follow redirects (default True)
  • headers – passed through to the requests.head call for following redirects
__call__(url, redirects=None)[source]

Canonicalizes a string URL.

Returns the canonical form of a string URL, or None if it can’t be canonicalized, ie it’s in the blacklist or its domain doesn’t match.

__weakref__

list of weak references to the object (if defined)

class oauth_dropins.webutil.util.WideUnicode(*args, **kwargs)[source]

Bases: unicode

String class with consistent indexing and len() on narrow and wide Python.

PEP 261 describes how Python 2 builds come in “narrow” and “wide” flavors. Wide is configured with --enable-unicode=ucs4, which represents Unicode high code points above the 16-bit Basic Multilingual Plane in unicode strings as single characters. This means that len(), indexing, and slices of unicode strings use Unicode code points consistently.

Narrow, on the other hand, represents high code points as “surrogate pairs” of 16-bit characters. This means that len(), indexing, and slicing unicode strings does not always correspond to Unicode code points.

Mac OS X, Windows, and older Linux distributions have narrow Python 2 builds, while many modern Linux distributions have wide builds, so this can cause platform-specific bugs, e.g. with many commonly used emoji.

Docs: https://www.python.org/dev/peps/pep-0261/ https://docs.python.org/2.7/library/codecs.html?highlight=ucs2#encodings-and-unicode http://www.unicode.org/glossary/#high_surrogate_code_point

Inspired by: http://stackoverflow.com/a/9934913

Related work: https://uniseg-python.readthedocs.io/ https://pypi.python.org/pypi/pytextseg https://github.com/LuminosoInsight/python-ftfy/ https://github.com/PythonCharmers/python-future/issues/116 https://dev.twitter.com/basics/counting-characters

On StackOverflow: http://stackoverflow.com/questions/1446347/how-to-find-out-if-python-is-compiled-with-ucs-2-or-ucs-4 http://stackoverflow.com/questions/12907022/python-getting-correct-string-length-when-it-contains-surrogate-pairs http://stackoverflow.com/questions/35404144/correctly-extract-emojis-from-a-unicode-string

__weakref__

list of weak references to the object (if defined)

handlers

Request handler utility classes.

Includes classes for serving templates with common variables and XRD[S] and JRD files like host-meta and friends.

oauth_dropins.webutil.handlers.handle_exception(self, e, debug)[source]

A webapp2 exception handler that propagates HTTP exceptions into the response.

Use this as a webapp2.RequestHandler.handle_exception() method by adding this line to your handler class definition:

handle_exception = handlers.handle_exception

I originally tried to put this in a webapp2.RequestHandler subclass, but it gave me this exception:

File ".../webapp2-2.5.1/webapp2_extras/local.py", line 136, in _get_current_object
  raise RuntimeError('no object bound to %s' % self.__name__)
RuntimeError: no object bound to app

These are probably related:

oauth_dropins.webutil.handlers.redirect(from_domain, to_domain)[source]

webapp2.RequestHandler decorator that 301 redirects to a new domain.

Preserves scheme, path, and query.

Parameters:
  • from_domain – string or sequence of strings
  • to_domain – string
oauth_dropins.webutil.handlers.memcache_response(expiration)[source]

webapp2.RequestHandler decorator that memcaches the response.

Parameters:expiration – datetime.timedelta
class oauth_dropins.webutil.handlers.ModernHandler(*args, **kwargs)[source]

Bases: webapp2.RequestHandler

Base handler that adds modern open/secure headers like CORS, HSTS, etc.

class oauth_dropins.webutil.handlers.TemplateHandler(*args, **kwargs)[source]

Bases: oauth_dropins.webutil.handlers.ModernHandler

Renders and serves a template based on class attributes.

Subclasses must override template_file() and may also override template_vars() and content_type().

template_file()[source]

Returns the string template file path.

template_vars()[source]

Returns a dict of template variable string keys and values.

content_type()[source]

Returns the string content type.

headers()[source]

Returns dict of HTTP response headers. Subclasses may override.

To advertise XRDS, use:

headers['X-XRDS-Location'] = 'https://%s/.well-known/host-meta.xrds' % appengine_config.HOST
class oauth_dropins.webutil.handlers.XrdOrJrdHandler(*args, **kwargs)[source]

Bases: oauth_dropins.webutil.handlers.TemplateHandler

Renders and serves an XRD or JRD file.

JRD is served if the request path ends in .json, or the query parameters include ‘format=json’, or the request headers include ‘Accept: application/json’.

Subclasses must override template_prefix().

template_prefix()[source]

Returns template filename, without extension.

is_jrd()[source]

Returns True if JRD should be served, False if XRD.
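The JRD selection rule described for this handler can be illustrated as a standalone function. The name is_jrd_sketch and its plain path, dict, and dict arguments are hypothetical stand-ins for the request object, and matching the Accept header by substring is an assumption.

```python
def is_jrd_sketch(path, query_params, headers):
    """Returns True if JRD should be served for this request, False for XRD."""
    return (path.endswith('.json')
            or query_params.get('format') == 'json'
            # Assumption: a substring match on the Accept header is enough.
            or 'application/json' in headers.get('Accept', ''))


by_path = is_jrd_sketch('/.well-known/host-meta.json', {}, {})
by_query = is_jrd_sketch('/.well-known/host-meta', {'format': 'json'}, {})
```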

class oauth_dropins.webutil.handlers.HostMetaHandler(*args, **kwargs)[source]

Bases: oauth_dropins.webutil.handlers.XrdOrJrdHandler

Renders and serves the /.well-known/host-meta file.

class oauth_dropins.webutil.handlers.HostMetaXrdsHandler(*args, **kwargs)[source]

Bases: oauth_dropins.webutil.handlers.TemplateHandler

Renders and serves the /.well-known/host-meta.xrds XRDS-Simple file.

models

App Engine datastore model base classes and utilities.

class oauth_dropins.webutil.models.StringIdModel(*args, **kwds)[source]

Bases: google.appengine.ext.ndb.model.Model

An ndb model class that requires a string id.

put(*args, **kwargs)[source]

Raises AssertionError if string id is not provided.

class oauth_dropins.webutil.models.KeyNameModel(*args, **kwargs)[source]

Bases: google.appengine.ext.db.Model

A db model class that requires a key name.

__init__(*args, **kwargs)[source]

Raises AssertionError if key name is not provided.

class oauth_dropins.webutil.models.SingleEGModel(self_or_cls, *args, **kwargs)[source]

Bases: google.appengine.ext.db.Model

A model class that stores all entities in a single entity group.

All entities use the same parent key (below), and all() automatically adds it as an ancestor. That allows, among other things, fetching all entities of this kind with strong consistency.

enforce_parent(fn)[source]

Sets the parent keyword arg. If it’s already set, checks that it’s correct.

classmethod shared_parent_key()[source]

Returns the shared parent key for this class.

It’s not actually an entity, just a placeholder key.

testutil

Unit test utilities.

oauth_dropins.webutil.testutil.get_task_params(task)[source]

Parses a task’s POST body and returns the query params in a dict.

oauth_dropins.webutil.testutil.get_task_eta(task)[source]

Returns a task’s ETA as a datetime.

class oauth_dropins.webutil.testutil.UrlopenResult(status_code, content, url=None, headers={})[source]

Bases: object

A fake urllib2.urlopen() or urlfetch.fetch() result object.

__weakref__

list of weak references to the object (if defined)

class oauth_dropins.webutil.testutil.TestCase(methodName='runTest')[source]

Bases: mox.MoxTestBase

Test case class with lots of extra helpers.

stub_requests_head()[source]

Automatically return 200 to outgoing HEAD requests.

unstub_requests_head()[source]

Mock outgoing HEAD requests so they must be expected individually.

expect_urlopen(url, response=None, status=200, data=None, headers=None, response_headers={}, **kwargs)[source]

Stubs out urllib2.urlopen() and sets up an expected call.

If status isn’t 2xx, makes the expected call raise a urllib2.HTTPError instead of returning the response.

If data is set, url must be a urllib2.Request.

If response is unset, returns the expected call.

Parameters:
  • url – string, re.RegexObject or urllib2.Request or webob.request.Request
  • response – string
  • status – int, HTTP response code
  • data – optional string POST body
  • headers – optional expected request header dict
  • response_headers – optional response header dict
  • kwargs – other keyword args, e.g. timeout
assert_entities_equal(a, b, ignore=frozenset([]), keys_only=False, in_order=False)[source]

Asserts that a and b are equivalent entities or lists of entities.

...specifically, that they have the same property values, and if they both have populated keys, that their keys are equal too.

Parameters:
  • a – db.Model or ndb.Model instances or lists of instances
  • b – same
  • ignore – sequence of strings, property names not to compare
  • keys_only – boolean, if True only compare keys
  • in_order – boolean. If False, all entities must have keys.
entity_keys(entities)[source]

Returns a list of keys for a list of entities.

assert_equals(expected, actual, msg=None, in_order=False)[source]

Pinpoints individual element differences in lists and dicts.

If in_order is False, ignores order in lists and tuples.

assert_multiline_equals(expected, actual)[source]

Compares two multi-line strings and reports a diff style output.

Ignores leading and trailing whitespace on each line, and squeezes repeated blank lines down to just one.

assert_multiline_in(expected, actual)[source]

Checks that a multi-line string is in another and reports a diff output.

Ignores leading and trailing whitespace on each line, and squeezes repeated blank lines down to just one.
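Both multi-line assertions describe the same whitespace normalization, which can be sketched on its own. The name normalize_multiline_sketch is hypothetical, and dropping leading and trailing blank lines entirely is an assumption.

```python
def normalize_multiline_sketch(text):
    """Strips each line and squeezes runs of blank lines down to one."""
    lines = [line.strip() for line in text.strip().splitlines()]
    squeezed = []
    for line in lines:
        # Keep a blank line only if the previous kept line was not blank.
        if line or (squeezed and squeezed[-1]):
            squeezed.append(line)
    return '\n'.join(squeezed)


normalized = normalize_multiline_sketch('  a  \n\n\n b\n')
# normalized == 'a\n\nb'
```

Comparing the normalized forms of the expected and actual strings gives the whitespace-insensitive check the two methods describe.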

class oauth_dropins.webutil.testutil.HandlerTest(methodName='runTest')[source]

Bases: oauth_dropins.webutil.testutil.TestCase

Base test class for webapp2 request handlers.

Uses App Engine’s testbed to set up API stubs: http://code.google.com/appengine/docs/python/tools/localunittesting.html

application

webapp2.WSGIApplication

handler

webapp2.RequestHandler