RegularExpression

Module providing routines to check and convert between regular and file expressions.

Utility module used by ClearMap.Utils.TagExpression.

class PatternInverter(groups=None)

Bases: object

generate_any(pattern)
generate_assert(pattern)
generate_assert_not(pattern)
generate_at(pattern)
generate_branch(pattern)
generate_group(pattern, group_name)
generate_group_ref(pattern)
generate_in(pattern)
generate_literal(pattern)
generate_max_repeat(pattern)
generate_min_repeat(pattern)
generate_negate(pattern)
generate_not_literal(pattern)
generate_range(pattern)
generate_re(pattern)
generate_re_type(pattern)
generate_subpattern(pattern)
at_categories = {AT_BEGINNING: '^', AT_BEGINNING_STRING: '\\A', AT_BOUNDARY: '\\b', AT_NON_BOUNDARY: '\\B', AT_END: '$', AT_END_STRING: '\\Z'}
escapes = {7: '\\a', 8: '\\b', 9: '\\t', 10: '\\n', 11: '\\v', 12: '\\f', 13: '\\r', 92: '\\'}
in_categories = {CATEGORY_DIGIT: '\\d', CATEGORY_NOT_DIGIT: '\\D', CATEGORY_SPACE: '\\s', CATEGORY_NOT_SPACE: '\\S', CATEGORY_WORD: '\\w', CATEGORY_NOT_WORD: '\\W'}
expression_to_glob(expression, replace=None, default='*', ignore='.[]')

Converts a regular expression to a glob expression, e.g. to search for files.

Arguments

expression : str

The regular expression.

replace : dict, all or None

A dictionary specifying how to replace specific groups. If all or None, all groups are replaced with the default.

ignore : list of chars

Ignore these special chars in the regular expression.

Returns

expression : str

The regular expression in glob form.
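
A minimal sketch of the idea, using only the standard library rather than ClearMap itself. The helper name groups_to_glob is hypothetical, and the sketch assumes non-nested groups, no escaped parentheses, and that only escaped dots need to be turned back into literals; the actual function may behave differently.

```python
import fnmatch
import re

# Matches one non-nested group such as (?P<z>\d+), (\d+) or (?:a|b).
# Nested groups and escaped parentheses are not handled in this sketch.
_GROUP = re.compile(r"\((?:\?P<(?P<name>\w+)>)?[^()]*\)")


def groups_to_glob(expression, replace=None, default="*"):
    """Replace every group by a glob placeholder or a fixed string (hypothetical helper)."""
    replace = replace or {}

    def substitute(match):
        # Unnamed groups have name None and always fall back to the default.
        return replace.get(match.group("name"), default)

    glob_expression = _GROUP.sub(substitute, expression)
    # An escaped dot in the regular expression is a plain dot in a glob.
    return glob_expression.replace(r"\.", ".")


expression = r"img_(?P<z>\d+)_(?P<c>\d+)\.tif"
glob_expression = groups_to_glob(expression)                    # "img_*_*.tif"
channel_one = groups_to_glob(expression, replace={"c": "1"})    # "img_*_1.tif"
matches = fnmatch.fnmatchcase("img_0042_1.tif", channel_one)    # True
```

The resulting string can be handed to glob.glob to list candidate files, which can then be filtered with the original regular expression.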

expression_to_pattern(expression, ignore=None)

Convert a regular expression to a parsed pattern for manipulation.

Arguments

expression : str

The regular expression to convert.

Returns

pattern : list

The parsed pattern of the regular expression.
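
To illustrate what a parsed pattern can look like, the sketch below uses the parser that ships with Python's re module. That parser is an internal, undocumented part of the standard library, and it is only an assumption that this module builds on it (the constant names in at_categories and in_categories above suggest so). The structure shown may change between Python versions.

```python
try:
    import re._parser as sre_parse   # Python 3.11 and later
except ImportError:
    import sre_parse                 # earlier Python 3 versions

# Parse the expression into a sequence of (opcode, argument) pairs.
pattern = sre_parse.parse(r"a\d+")
items = list(pattern)

# First item: the literal character 'a', stored as its code point.
first_op, first_arg = items[0]

# Second item: a greedy repeat with (minimum, maximum, repeated subpattern).
second_op, (low, high, repeated) = items[1]
```

Here first_op is the LITERAL opcode with argument ord('a'), and second_op is MAX_REPEAT with a minimum of one repetition.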

format_expression(expression, ignore=None)

Inserts escapes in front of certain regular expression symbols.

Arguments

expression : str

The regular expression.

ignore : list of chars

A list of characters to ignore as regular expression commands.

Returns

expression : str

The regular expression with escaped characters that are ignored.
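
The escaping step can be sketched as follows. The helper escape_ignored is hypothetical and assumes that characters which are already escaped should be left untouched; it is not taken from the ClearMap source.

```python
def escape_ignored(expression, ignore):
    """Put a backslash in front of every unescaped character listed in ignore."""
    out = []
    i = 0
    while i < len(expression):
        ch = expression[i]
        if ch == "\\" and i + 1 < len(expression):
            # Copy an existing escape sequence such as \d or \. unchanged.
            out.append(expression[i:i + 2])
            i += 2
        elif ch in ignore:
            out.append("\\" + ch)
            i += 1
        else:
            out.append(ch)
            i += 1
    return "".join(out)


# The dot in a file name should be a literal dot, not "any character".
escaped = escape_ignored(r"img_\d+.tif", ["."])   # r"img_\d+\.tif"
```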

glob_to_expression(expression, groups=None, to_group='*')

Converts a glob expression to a regular expression.

Arguments

expression : str

A glob expression.

groups : dict or None

A dictionary specifying how to name groups in the form {id : name}.

to_group : list of chars or None

Glob placeholders to convert to a group.

Returns

expression : str

The regular expression.
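
As a rough standard-library analogue, fnmatch.translate converts a glob to a regular expression without groups. The hypothetical helper glob_to_named_regex below additionally turns each '*' into a named group; it takes a plain list of names rather than the {id : name} dictionary described above, and it only handles the '*' placeholder.

```python
import fnmatch
import re


def glob_to_named_regex(glob_expression, names):
    """Turn each '*' of a glob into a named, non-greedy group (sketch)."""
    literals = glob_expression.split("*")
    if len(names) != len(literals) - 1:
        raise ValueError("need exactly one name per '*' placeholder")
    parts = [re.escape(literals[0])]
    for name, literal in zip(names, literals[1:]):
        parts.append("(?P<%s>.*?)" % name)
        parts.append(re.escape(literal))
    return "".join(parts)


# Standard library conversion, no groups.
plain = re.compile(fnmatch.translate("img_*_*.tif"))

# Conversion with named groups for each placeholder.
named = re.compile(glob_to_named_regex("img_*_*.tif", ["z", "c"]))
values = named.fullmatch("img_0012_1.tif").groupdict()   # {'z': '0012', 'c': '1'}
```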

group_dict(expression, value, as_types=[int, float])

Returns a dictionary with the values of the groups in the regular expression that match the value string.

Arguments

expression : str

The regular expression.

value : str

The text to match and extract group values from.

as_types : list of types

List of types to try to convert the extracted group value to.

Returns

values : dict

The values for each group item.
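
The behaviour described here can be approximated with re.match and groupdict. The helper match_group_values is a hypothetical stand-in; returning an empty dictionary when nothing matches, and keeping the string when no conversion succeeds, are assumptions of this sketch.

```python
import re


def _convert(text, as_types):
    """Return text converted by the first type that accepts it, else unchanged."""
    for as_type in as_types:
        try:
            return as_type(text)
        except (TypeError, ValueError):
            continue
    return text


def match_group_values(expression, value, as_types=(int, float)):
    """Extract named group values from value and convert them where possible."""
    match = re.match(expression, value)
    if match is None:
        return {}
    return {name: _convert(text, as_types)
            for name, text in match.groupdict().items()}


values = match_group_values(r"img_(?P<z>\d+)_(?P<name>\w+)\.tif", "img_0042_red.tif")
# values == {'z': 42, 'name': 'red'}
```

Trying int before float keeps whole numbers such as plane indices as integers.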

group_names(expression)

Returns the names of groups in the regular expression.

Arguments

expression : str

The regular expression.

Returns

names : list of str

The group names in the regular expression sorted according to appearance.
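
With the standard library alone, the same information is available from the groupindex attribute of a compiled pattern, which maps each group name to its position. The helper name below is hypothetical.

```python
import re


def ordered_group_names(expression):
    """Return the named groups of expression in order of appearance."""
    index = re.compile(expression).groupindex   # mapping name -> group number
    return sorted(index, key=index.get)


names = ordered_group_names(r"(?P<y>\d+)_(\d+)_(?P<x>\d+)")   # ['y', 'x']
```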

insert_group_names(expression, groups=None, ignore=None)

Inserts group names into a regular expression for specified groups.

Arguments

expression : str

The regular expression.

groups : dict or None

A dictionary specifying the group names as {groupid : groupname}.

Returns

expression : str

The regular expression with named groups.
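
A simplified sketch of naming unnamed groups by scanning the expression string. The helper name_groups is hypothetical, and numbering groups from 1 (as the re module does) is an assumption, since the id convention is not stated here. Parentheses inside character classes such as [(] are not handled.

```python
import re


def name_groups(expression, groups):
    """Give unnamed capturing groups the names in groups ({group id: name})."""
    out = []
    group_id = 0
    i = 0
    while i < len(expression):
        ch = expression[i]
        if ch == "\\":
            # Copy escape sequences such as \( or \d unchanged.
            out.append(expression[i:i + 2])
            i += 2
            continue
        if ch == "(":
            if expression.startswith("(?P<", i):
                group_id += 1            # already named: count it, keep it
            elif not expression.startswith("(?", i):
                group_id += 1            # plain capturing group
                name = groups.get(group_id)
                if name is not None:
                    out.append("(?P<%s>" % name)
                    i += 1
                    continue
        out.append(ch)
        i += 1
    return "".join(out)


named = name_groups(r"(\d+)_(?:x|y)_(\w+)", {1: "z", 2: "c"})
# named == r"(?P<z>\d+)_(?:x|y)_(?P<c>\w+)"
index = dict(re.compile(named).groupindex)   # {'z': 1, 'c': 2}
```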

is_expression(expression, group_names=None, n_patterns=None, ignore=None, exclude=None, verbose=False)

Checks if the regular expression fulfills certain criteria.

Arguments

expression : str

The regular expression to check.

group_names : list of str or None

List of group names expected to be present in the expression.

n_patterns : int or None

Number of group patterns to expect. If negative, the expression is expected to have at least this number of groups.

ignore : list of chars or None

Optional list of chars that should not be regarded as a regular expression command. Useful for filenames, e.g. by setting ignore = ['.'].

exclude : list of str or None

Exclude these tokens when counting groups.

verbose : bool

If True, print the reason why the expression does not fulfill the desired criteria.

Returns

is_expression : bool

Returns True if the expression fulfills the desired criteria.
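
A reduced version of such a check can be written with the re module. The helper check_expression below is hypothetical: it counts capturing groups only, reads a negative n_patterns as "at least abs(n_patterns) groups", and leaves out the ignore and exclude options.

```python
import re


def check_expression(expression, group_names=None, n_patterns=None, verbose=False):
    """Return True if expression compiles and meets the group criteria (sketch)."""
    try:
        compiled = re.compile(expression)
    except re.error as error:
        if verbose:
            print("not a valid regular expression: %s" % error)
        return False

    if group_names is not None:
        missing = [name for name in group_names if name not in compiled.groupindex]
        if missing:
            if verbose:
                print("missing group names: %r" % missing)
            return False

    if n_patterns is not None:
        n = compiled.groups   # number of capturing groups
        ok = n >= -n_patterns if n_patterns < 0 else n == n_patterns
        if not ok:
            if verbose:
                print("expected %d groups, found %d" % (n_patterns, n))
            return False

    return True


expression = r"img_(?P<z>\d+)_(?P<c>\d+)\.tif"
has_z = check_expression(expression, group_names=["z"], n_patterns=2)   # True
```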

n_groups(expression)

Returns the number of groups in the expression.

Arguments

expression : str

The regular expression.

Returns

n : int

The number of groups in the expression.

pattern_to_expression(pattern)

Convert a pattern to a regular expression.

Arguments

pattern : list

The regular expression in pattern form.

Returns

expression : str

The regular expression.
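
The reverse direction can be illustrated with a very small inverter over the output of Python's internal regular expression parser. This is only a sketch under the assumption that patterns are (opcode, argument) sequences as produced by that parser. It covers literals, the dot and greedy repeats, and raises an error for everything else. The function names are hypothetical.

```python
import re

try:
    import re._parser as sre_parse   # Python 3.11 and later
except ImportError:
    import sre_parse                 # earlier Python 3 versions


def _quantifier(low, high):
    """Return the quantifier text for a repeat with the given bounds."""
    if high == sre_parse.MAXREPEAT:
        return {0: "*", 1: "+"}.get(low, "{%d,}" % low)
    if (low, high) == (0, 1):
        return "?"
    return "{%d,%d}" % (low, high)


def to_expression(pattern):
    """Rebuild an expression string from a parsed pattern (small subset only)."""
    parts = []
    for op, arg in pattern:
        if op == sre_parse.LITERAL:
            parts.append(re.escape(chr(arg)))
        elif op == sre_parse.ANY:
            parts.append(".")
        elif op == sre_parse.MAX_REPEAT:
            low, high, repeated = arg
            parts.append(to_expression(repeated) + _quantifier(low, high))
        else:
            raise NotImplementedError("unsupported opcode: %r" % (op,))
    return "".join(parts)


round_trip = to_expression(sre_parse.parse("a.b+c*"))   # "a.b+c*"
```

Each branch handles one opcode, which mirrors the one-method-per-opcode layout of the generate_* methods listed for PatternInverter above.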

replace(expression, replace=None, ignore=None)

Replaces patterns in a regular expression with given strings.

Arguments

expression : str

The regular expression.

replace : dict

The replacements to do in the regular expression given as {pos : str} or {groupname : str}.

ignore : list of chars

Ignore certain regular expression commands.

Returns

replaced : str

The regular expression with replacements.

subpatterns_to_groups(expression, ignore=None, exclude=None, group_names=None)

Replaces subpatterns with groups in a regular expression.

Arguments

expression : str

The regular expression to check.

ignore : list of chars or None

Optional list of chars that should not be regarded as a regular expression command. Useful for filenames, e.g. by setting ignore = ['.'].

exclude : list of str or None

Exclude these tokens when counting groups.

group_names : list of str

The group names to use for the new groups.

Returns

expression : str

The regular expression with subpatterns replaced as groups.