Release v0.6.1
Merge branch 'develop'
goodmami committed Mar 13, 2017
2 parents b189e9b + 47a27a9 commit e1c79a3
Showing 6 changed files with 188 additions and 24 deletions.
20 changes: 19 additions & 1 deletion CHANGELOG.md
@@ -2,12 +2,29 @@

## [Unreleased][unreleased]

### Fixed
### Added
### Removed
### Fixed
### Changed
### Deprecated

## [v0.6.1][v0.6.1]

### Added

* Some additional regular expressions on `PENMANCodec` to influence
parsing behavior
* `CONTRIBUTING.md`

### Fixed

* Allow numeric and string variables and node types

### Changed

* Grammar in README now more accurately reflects parsing behavior (and
vice versa)

## [v0.6.0][]

### Fixed
@@ -183,4 +200,5 @@ First release with very basic functionality.
[v0.5.0]: ../../releases/tag/v0.5.0
[v0.5.1]: ../../releases/tag/v0.5.1
[v0.6.0]: ../../releases/tag/v0.6.0
[v0.6.1]: ../../releases/tag/v0.6.1
[README]: README.md
31 changes: 31 additions & 0 deletions CONTRIBUTING.md
@@ -0,0 +1,31 @@

# How to contribute

Thanks for getting involved in Penman's development!

### Reporting bugs and requesting features

Please report bugs or feature requests on the
[issues](https://github.com/goodmami/penman/issues) page.
Mention the version numbers of Penman and Python that you are using.
Also report inaccurate or missing documentation from the
[API docs](http://goodmami.github.io/penman/docs/API).

### Contributing code

If you wish to contribute code to Penman, please fork the repository to
your own account, commit your changes, and submit a
[pull request](https://github.com/goodmami/penman/compare/) to the
`develop` branch.

Please follow [PEP8](https://www.python.org/dev/peps/pep-0008/) unless you have a
good reason not to, and also try to follow the conventions set by the
Penman codebase.

I also try to follow this branching model:
http://nvie.com/posts/a-successful-git-branching-model/

Basically, each new changeset (e.g. features or bug fixes) should have
its own branch. Changeset branches (except critical bug fixes) get
merged to the develop branch, and develop gets merged back to master
when a new release is ready.
14 changes: 8 additions & 6 deletions README.md
@@ -46,7 +46,7 @@ the example [below](#library-usage).
(d / dog
:ARG0-of (b / bark))
>>> print(penman.encode(g, indent=False))
(b / bark :ARG0 (d / dog))```
(b / bark :ARG0 (d / dog))
```

### Script Usage
@@ -103,12 +103,13 @@ NodeData <- Variable ('/' NodeType)? Edge*
NodeType <- Atom
Variable <- Atom
Edge <- Relation Value
Relation <- /:[^\s(]*/
Value <- Node | String | Float | Integer | Atom
Relation <- /:[^\s(),]*/
Value <- Node | Atom
Atom <- String | Float | Integer | Symbol
String <- /"[^"\\]*(?:\\.[^"\\]*)*"/
Atom <- /[^\s)\/]+/
Float <- /[-+]?(0|[1-9]\d*)(\.\d+[eE][-+]?|\.|[eE][-+]?)\d+/
Float <- /[-+]?(((\d+\.\d*|\.\d+)([eE][-+]?\d+)?)|\d+[eE][-+]?\d+)/
Integer <- /[-+]?\d+/
Symbol <- /[^\s()\/,]+/
```

\* *Note: I use `|` above for ordered-choice instead of `/` so that `/`
@@ -120,7 +121,7 @@ could be given as a disjunction of allowed names. Similarly, Relations
could be a disjunction of allowed names and possible inversions, or
otherwise require at least one character after `:`. It might also
restrict Variables to a form like `/[a-z]+\d*/` and also restrict Atom
values in some way. The included `AMRCodec` employs most of these
values in some way. The included [AMRCodec][] employs most of these
restrictions and raises `DecodeError`s for graphs it deems invalid. See
also [Nathan Schneider's PEG for AMR](https://github.com/nschneid/amr-hackathon/blob/master/src/amr.peg).
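
As a rough illustration of the ordered choice in the `Atom` rule of the grammar, the sketch below tries each alternative in turn using only the standard `re` module. The patterns are copied from the grammar; the `classify_atom` helper is hypothetical and is not part of Penman's API. `Symbol` is tried last because its pattern would also accept numbers and unspaced quoted strings.

```python
import re

# Patterns copied from the README grammar; fullmatch() anchors them
# to the whole token.
STRING_RE = re.compile(r'"[^"\\]*(?:\\.[^"\\]*)*"')
FLOAT_RE = re.compile(
    r'[-+]?(((\d+\.\d*|\.\d+)([eE][-+]?\d+)?)|\d+[eE][-+]?\d+)'
)
INTEGER_RE = re.compile(r'[-+]?\d+')
SYMBOL_RE = re.compile(r'[^\s()\/,]+')

# Order matters: this mirrors String | Float | Integer | Symbol.
ALTERNATIVES = (
    ('String', STRING_RE),
    ('Float', FLOAT_RE),
    ('Integer', INTEGER_RE),
    ('Symbol', SYMBOL_RE),
)


def classify_atom(token):
    """Return the name of the first Atom alternative that matches token.

    Hypothetical helper for illustration only; returns None when no
    alternative matches the whole token.
    """
    for name, pattern in ALTERNATIVES:
        if pattern.fullmatch(token):
            return name
    return None
```

For example, `classify_atom('1.1')` falls under `Float` and `classify_atom('dog')` under `Symbol`, while a token containing whitespace outside of quotes matches none of the alternatives.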

@@ -137,6 +138,7 @@ This project is not affiliated with [ISI], the [PENMAN] project, or the

[documentation]: docs/API.md
[PENMANCodec]: docs/API.md#penmancodec
[AMRCodec]: docs/API.md#amrcodec
[encode(g)]: docs/API.md#encode
[decode(s)]: docs/API.md#decode
[load(f)]: docs/API.md#load
65 changes: 49 additions & 16 deletions penman.py
@@ -34,14 +34,16 @@
# API overview:
#
# Classes:
# * PENMANCodec(indent=True)
# * PENMANCodec(indent=True, relation_sort=original_order)
# - PENMANCodec.decode(s)
# - PENMANCodec.iterdecode(s)
# - PENMANCodec.encode(g, top=None)
# - PENMANCodec.is_relation_inverted(relation)
# - PENMANCodec.invert_relation(relation)
# - PENMANCodec.handle_triple(source, relation, target)
# - PENMANCodec.triples_to_graph(triples, top=None)
# * AMRCodec(indent=True, relation_sort=original_order)
# - (methods are the same as PENMANCodec)
# * Triple(source, relation, target)
# * Graph(data=None, top=None)
# - Graph.top
@@ -57,6 +59,9 @@
# * loads(string, triples=False, cls=PENMANCodec, **kwargs)
# * dump(graphs, file, triples=False, cls=PENMANCodec, **kwargs)
# * dumps(graphs, triples=False, cls=PENMANCodec, **kwargs)
# * original_order(triples)
# * out_first_order(triples)
# * alphanum_order(triples)

import re
from collections import namedtuple, defaultdict
@@ -65,7 +70,7 @@
except NameError:
basestring = str

__version__ = '0.6.0'
__version__ = '0.6.1'
__version_info__ = [
int(x) if x.isdigit() else x
for x in re.findall(r'[0-9]+|[^0-9\.-]+', __version__)
@@ -110,12 +115,22 @@ class PENMANCodec(object):
TYPE_REL = 'instance'
TOP_VAR = None
TOP_REL = 'top'
NODE_ENTER_RE = re.compile(r'\s*(\()\s*([^\s()\/,]+)\s*')
NODE_ENTER_RE = re.compile(r'\s*(\()\s*')
NODE_EXIT_RE = re.compile(r'\s*(\))\s*')
RELATION_RE = re.compile(r'(:[^\s(),]*)\s*')
ATOM_RE = re.compile(r'\s*([^\s()\/,]+)\s*')
STRING_RE = re.compile(r'("[^"\\]*(?:\\.[^"\\]*)*")\s*')
INT_RE = re.compile(r'[+-]?\d+')
FLOAT_RE = re.compile(
r'[-+]?(((\d+\.\d*|\.\d+)([eE][-+]?\d+)?)|\d+[eE][-+]?\d+)'
)
ATOM_RE = re.compile(r'([^\s()\/,]+)')
STRING_RE = re.compile(r'("[^"\\]*(?:\\.[^"\\]*)*")')
VAR_RE = re.compile(
'({}|{}|{}|{})'.format(STRING_RE.pattern, FLOAT_RE.pattern,
INT_RE.pattern, ATOM_RE.pattern)
)
NODETYPE_RE = VAR_RE # default; allow strings, numbers, and symbols
COMMA_RE = re.compile(r'\s*,\s*')
SPACING_RE = re.compile(r'\s*')

def __init__(self, indent=True, relation_sort=original_order):
"""
@@ -324,23 +339,35 @@ def _decode_triple_conjunction(self, s, pos=0):
if start is None:
start = m.start(1)
pos, rel = m.end(0), m.group(1)
m = _regex(self.NODE_ENTER_RE, s, pos, '"(" and a variable')
pos, var = m.end(0), m.group(2)

m = _regex(self.NODE_ENTER_RE, s, pos, '"("')
pos = m.end(0)

m = _regex(self.VAR_RE, s, pos, "a variable (node identifier)")
pos, var = m.end(0), m.group(1).strip()

m = _regex(self.COMMA_RE, s, pos, '","')
pos = m.end(0)
if s[pos] == '"':
m = _regex(self.STRING_RE, s, pos, 'a quoted string')

if rel == self.TYPE_REL:
m = _regex(self.NODETYPE_RE, s, pos, 'a node type')
else:
m = _regex(self.ATOM_RE, s, pos, 'a float/int/atom')
if s[pos] == '"':
m = _regex(self.STRING_RE, s, pos, 'a quoted string')
else:
m = _regex(self.ATOM_RE, s, pos, 'a float/int/symbol')
pos, tgt = m.end(0), m.group(1)

if var == self.TOP_VAR and rel == self.TOP_REL:
top = tgt
elif rel == self.TYPE_REL:
nodes.append((var, rel, tgt))
else:
edges.append((var, rel, tgt))

m = _regex(self.NODE_EXIT_RE, s, pos, '")"')
pos = m.end(1)

if m.end(0) < len(s) and s[m.end(0)] == '^':
pos = m.end(0) + 1
else:
@@ -353,15 +380,19 @@ def _decode_penman_node(self, s, pos=0):
nodes, edges = [], []

strlen = len(s)
m = _regex(self.NODE_ENTER_RE, s, pos, '"(" and a variable')
start, pos, var = m.start(1), m.end(0), m.group(2)
m = _regex(self.NODE_ENTER_RE, s, pos, '"("')
start, pos = m.start(1), m.end(0)

m = _regex(self.VAR_RE, s, pos, "a variable (node identifier)")
pos, var = m.end(0), m.group(1).strip()

nodetype = None
while pos < strlen and s[pos] != ')':

# node type
if s[pos] == '/':
m = _regex(self.ATOM_RE, s, pos+1, 'a node type')
pos = self.SPACING_RE.match(s, pos=pos+1).end()
m = _regex(self.NODETYPE_RE, s, pos, 'a node type')
pos, nodetype = m.end(0), m.group(1)

# relation
@@ -384,7 +415,7 @@ def _decode_penman_node(self, s, pos=0):
m = _regex(self.STRING_RE, s, pos, 'a quoted string')
pos, value = m.end(0), m.group(1)
else:
m = _regex(self.ATOM_RE, s, pos, 'a float/int/atom')
m = _regex(self.ATOM_RE, s, pos, 'a float/int/symbol')
pos, value = m.end(0), m.group(1)
edges.append((var, rel, value))

@@ -482,7 +513,8 @@ def _layout(self, g, src, offset, seen):
for t in outedges:
if t.relation == self.TYPE_REL:
if t.target is not None:
branches = ['/ ' + t.target] + branches # always first
# node types always come first
branches = ['/ {}'.format(t.target)] + branches
else:
if t.inverted:
tgt = t.source
@@ -519,7 +551,8 @@ class AMRCodec(PENMANCodec):
TOP_VAR = None
TOP_REL = 'top'
# vars: [a-z]+\d* ; first relation must be node type
NODE_ENTER_RE = re.compile(r'\s*(\()\s*([a-z]+\d*)\s*(?=\/)')
NODE_ENTER_RE = re.compile(r'\s*(\()\s*(?=[a-z]+\d*\s*\/)')
VAR_RE = re.compile(r'([a-z]+\d*)')
# only non-anonymous relations
RELATION_RE = re.compile(r'(:[^\s(),]+)\s*')
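
The effect of the tightened `AMRCodec` patterns can be checked in isolation. The sketch below copies `NODE_ENTER_RE` and `VAR_RE` from the class above rather than importing Penman, and the `read_amr_variable` helper is a hypothetical name used only for illustration. It shows how the lookahead rejects nodes whose variable is not of the form `[a-z]+\d*` or whose variable is not immediately followed by `/`.

```python
import re

# Copied from AMRCodec in this commit (not imported from penman).
NODE_ENTER_RE = re.compile(r'\s*(\()\s*(?=[a-z]+\d*\s*\/)')
VAR_RE = re.compile(r'([a-z]+\d*)')


def read_amr_variable(s, pos=0):
    """Return (variable, next_pos) for an AMR-style node opening at pos.

    Hypothetical helper: returns None when the text at pos is not "("
    followed by a variable like "a" or "b2" and then a "/".
    """
    m = NODE_ENTER_RE.match(s, pos)
    if m is None:
        return None
    # The lookahead in NODE_ENTER_RE guarantees that VAR_RE matches here.
    m = VAR_RE.match(s, m.end(0))
    return m.group(1), m.end(0)
```

Under these patterns, `'(d / dog)'` yields the variable `d`, whereas `'(1 / one)'` and `'(d :ARG0 (b / bark))'` are both rejected at the opening parenthesis.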

2 changes: 1 addition & 1 deletion setup.py
@@ -11,7 +11,7 @@

setup(
name='Penman',
version='0.6.0',
version='0.6.1',
description='PENMAN notation for graphs (e.g. AMR).',
long_description=long_description,
url='https://github.com/goodmami/penman',
80 changes: 80 additions & 0 deletions tests/test_penman.py
@@ -116,6 +116,34 @@ def test_decode(x1, x2):
('a', 'ARG', 15)
]

# numeric node type
g = decode('(one / 1)')
assert g.triples() == [
('one', 'instance', 1)
]

# string node type
g = decode('(one / "a string")')
assert g.triples() == [
('one', 'instance', '"a string"')
]

# numeric "variable"
g = decode('(1 / one)')
assert g.triples() == [
(1, 'instance', 'one')
]
g = decode('(1.1 / one)')
assert g.triples() == [
(1.1, 'instance', 'one')
]

# string "variable"
g = decode('("a string" / one)')
assert g.triples() == [
('"a string"', 'instance', 'one')
]

# fuller example
assert decode(x1[0]).triples() == x1[1]
assert decode(x2[0]).triples() == x2[1]
@@ -228,6 +256,34 @@ def test_encode(x1, x2):
])
assert encode(g) == '(a :ARG 15)'

# numeric node type
g = penman.Graph([
('one', 'instance', 1)
])
assert encode(g) == '(one / 1)'

# string node type
g = penman.Graph([
('one', 'instance', '"a string"')
])
assert encode(g) == '(one / "a string")'

# numeric "variable"
g = penman.Graph([
(1, 'instance', 'one')
])
assert encode(g) == '(1 / one)'
g = penman.Graph([
(1.1, 'instance', 'one')
])
assert encode(g) == '(1.1 / one)'

# string "variable"
g = penman.Graph([
('"a string"', 'instance', 'one')
])
assert encode(g) == '("a string" / one)'

assert encode(penman.Graph(x1[1])) == x1[0]
assert encode(penman.Graph(x2[1])) == x2[0]

@@ -481,6 +537,15 @@ def test_loads_triples():
assert len(gs) == 1
assert gs[0].triples() == [('a', 'instance', 'alpha'), ('a', 'ARG', 'b')]

gs = penman.loads('instance(1, alpha)', triples=True)
assert gs[0].triples() == [(1, 'instance', 'alpha')]

gs = penman.loads('instance(1.1, alpha)', triples=True)
assert gs[0].triples() == [(1.1, 'instance', 'alpha')]

gs = penman.loads('instance("a string", alpha)', triples=True)
assert gs[0].triples() == [('"a string"', 'instance', 'alpha')]

class TestCodec(penman.PENMANCodec):
TYPE_REL = 'test'
TOP_VAR = 'TOP'
@@ -524,6 +589,21 @@ def test_dumps_triples():
triples=True
) == 'instance(a, None) ^\nARG(a, b)'

# assert the comparisons instead of assigning their result to gs
assert penman.dumps(
[penman.Graph([(1, 'instance', 'alpha')])],
triples=True
) == 'instance(1, alpha)'

assert penman.dumps(
[penman.Graph([(1.1, 'instance', 'alpha')])],
triples=True
) == 'instance(1.1, alpha)'

assert penman.dumps(
[penman.Graph([('"a string"', 'instance', 'alpha')])],
triples=True
) == 'instance("a string", alpha)'

class TestCodec(penman.PENMANCodec):
TYPE_REL = 'test'
TOP_VAR = 'TOP'