/[volute]/trunk/projects/semantics/vocabularies/convert.py
Revision 5327
Fri Mar 1 10:31:00 2019 UTC by msdemlei
File MIME type: text/x-python
File size: 26215 byte(s)
vocabularies: Uniquing SKOS concepts in HTML formatting.

Also, some improvement in theory vocabulary metadata.

Also, non-ASCII in vocabs.conf is now allowed.


#!/usr/bin/env python

"""
A little script to convert the CSV input format to various outputs.

Dependencies: python2, rapper, skosify

This re-uses ideas of Norman Gray, applied to the datalink vocabulary,
but adapted for the present case with multiple vocabularies.

We work from a configuration file for vocabularies. It is written in
ini style, where each section corresponds to a vocabulary. All items
are mandatory. Here's how to configure a vocabulary myterms::

    [myterms]
    timestamp: 2016-08-17
    title: My terms as an example
    description: This is a collection of terms not actually used
        anywhere. But then, the CSV we're referencing in a moment
        doesn't exist either.
    authors: John Doe; Fred Flintstone

The actual terms are expected in a file <section name>/terms.csv (in the
example, this would be myterms/terms.csv; if you absolutely have to,
you can override this using a filename line, but let's avoid that). It must be
a CSV file with the following columns:

    predicate; level; label; description; synonym

level is 1 for "root" terms, 2 for child terms, etc.
synonym, if given, references the "canonical" term for the concept.
synonym can be left out. Note that we use the semicolon as the
delimiter because description frequently has commas in it and we don't
want to do too much quoting. Non-ASCII is allowed in label and description;
files must be in UTF-8.

An alternative, SKOS-based approach is used by the theory IG. It should
not be used for new projects.

This program is in the public domain.

In case of problems, please contact Markus Demleitner
<msdemlei@ari.uni-heidelberg.de>
"""
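To make the semicolon-delimited format concrete, here is a minimal sketch of reading such a file with the csv module; the two terms are invented for illustration:

```python
import csv
import io

# Hypothetical two-term vocabulary in the column order documented above:
# predicate; level; label; description; synonym
sample = io.StringIO(
    "dataproduct;1;Data product;Any data product the service returns\n"
    "spectrum;2;Spectrum;A one-dimensional spectrum\n")

for rec in csv.reader(sample, delimiter=";"):
    # level "2" marks spectrum as a child of the preceding level-1 term
    print(rec)
```

Note that the semicolon delimiter lets the description column contain commas without any quoting.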
45
46 from ConfigParser import ConfigParser
47 from xml.etree import ElementTree as etree
48
49 import contextlib
50 import csv
51 import itertools
52 import os
53 import re
54 import subprocess
55 import textwrap
56 import sys
57 import urlparse
58
59
60 MANDATORY_KEYS = frozenset([
61 "timestamp", "title", "description", "authors"])
62
63 HT_ACCESS_TEMPLATE_CSV = """# .htaccess for content negotiation
64
65 # This file is patterned after Recipe 3 in the W3C document 'Best
66 # Practice Recipes for Publishing RDF Vocabularies', at
67 # <http://www.w3.org/TR/swbp-vocab-pub/>
68
69 AddType application/rdf+xml .rdf
70 AddType text/turtle .ttl
71 AddCharset UTF-8 .ttl
72 AddCharset UTF-8 .html
73
74 RewriteEngine On
75 RewriteBase {install_base}
76
77 RewriteCond %{{HTTP_ACCEPT}} application/rdf\+xml
78 RewriteRule ^$ {timestamp}/{name}.rdf [R=303]
79
80 RewriteCond %{{HTTP_ACCEPT}} text/turtle
81 RewriteRule ^$ {timestamp}/{name}.ttl [R=303]
82
83 # No accept conditions: make the .html version the default
84 RewriteRule ^$ {timestamp}/{name}.html [R=303]
85 """
86
87
88 HT_ACCESS_TEMPLATE_SKOS = """# .htaccess for content negotiation
89
90 # This file is patterned after Recipe 3 in the W3C document 'Best
91 # Practice Recipes for Publishing RDF Vocabularies', at
92 # <http://www.w3.org/TR/swbp-vocab-pub/>
93
94 AddType text/skos .skos
95 AddCharset UTF-8 .skos
96 AddCharset UTF-8 .html
97
98 RewriteEngine On
99 RewriteBase {install_base}
100
101 RewriteCond %{{HTTP_ACCEPT}} text.skos
102 RewriteRule ^$ {timestamp}/{name}.skos [R=303]
103
104 # No accept conditions: make the .html version the default
105 RewriteRule ^$ {timestamp}/{name}.html [R=303]
106 """
107
108
109
110 TTL_HEADER_TEMPLATE = """@base {baseuri}.
111 @prefix : <#>.
112
113 @prefix dc: <http://purl.org/dc/terms/> .
114 @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
115 @prefix owl: <http://www.w3.org/2002/07/owl#> .
116 @prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
117 @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
118 @prefix foaf: <http://xmlns.com/foaf/0.1/>.
119
120 <> a owl:Ontology;
121 dc:created {timestamp};
122 dc:creator {creators};
123 rdfs:label {title}@en;
124 dc:title {title}@en;
125 dc:description {description}.
126
127 dc:created a owl:AnnotationProperty.
128 dc:creator a owl:AnnotationProperty.
129 dc:title a owl:AnnotationProperty.
130 dc:description a owl:AnnotationProperty.
131 """
132
133
134 JAVASCRIPT = """
135 current_highlight = null;
136
137 function highlight_fragment(ev) {
138 var parts = document.URL.split("#");
139 if (parts.length==2) {
140 if (current_highlight) {
141 var oldEl = document.getElementById(current_highlight);
142 oldEl.setAttribute("style", "");
143 }
144 current_highlight = parts[parts.length-1];
145 var el = document.getElementById(current_highlight);
146 el.setAttribute("style", "border: 2pt solid yellow");
147 }
148 }
149
150 window.addEventListener("load", highlight_fragment);
151 window.addEventListener("hashchange", highlight_fragment);
152 """

CSS_STYLE = """
html {
    font-family: sans;
}

h1 {
    margin-bottom: 3ex;
    border-bottom: 2pt solid #ccc;
}

tr {
    padding-top: 2pt;
    padding-bottom: 2pt;
    border-bottom: 1pt solid #ccc;
}

thead tr {
    border-top: 1pt solid black;
    border-bottom: 1pt solid black;
}

th {
    padding: 4pt;
}

.intro {
    max-width: 30em;
    margin-bottom: 5ex;
    margin-left: 2ex;
}

.outro {
    max-width: 30em;
    margin-top: 4ex;
}

table {
    border-collapse: collapse;
    border-bottom: 1pt solid black;
}

td {
    vertical-align: top;
    padding: 2pt;
}

th:nth-child(1),
td:nth-child(1) {
    background: #eef;
}

th:nth-child(3),
td:nth-child(3) {
    background: #eef;
    max-width: 20em;
}

.draftwarning {
    border-left: 3pt solid red;
    padding-left: 6pt;
}

ul.compactlist {
    list-style-type: none;
    padding-left: 0pt;
}

ul.compactlist li {
    margin-bottom: 0.3ex;
}
"""


class ReportableError(Exception):
    """is raised for expected and explainable error conditions.

    All other exceptions lead to tracebacks for further debugging.
    """


############ some utility functions

@contextlib.contextmanager
def work_dir(dir_name):
    """a context manager for temporarily working in dir_name.

    dir_name, if non-existing, is created.
    """
    if not os.path.isdir(dir_name):
        os.makedirs(dir_name)
    owd = os.getcwd()
    os.chdir(dir_name)
    try:
        yield
    finally:
        os.chdir(owd)


def is_URI(s):
    """returns True if we believe s is a URI.

    This is a simple, RE-based heuristic.
    """
    return bool(re.match("[a-zA-Z]+://|#", s))
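A small usage sketch makes the heuristic concrete (the example strings are invented): anything with a scheme prefix or a leading `#` counts as a URI, plain term names do not.

```python
import re

def is_URI(s):
    # the same RE-based heuristic as above: a scheme:// prefix or a
    # fragment-only reference both count as URIs
    return bool(re.match("[a-zA-Z]+://|#", s))

print(is_URI("http://www.ivoa.net/rdf/"))  # full URI with scheme
print(is_URI("#calibration"))              # fragment reference
print(is_URI("calibration"))               # plain term name
```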


############ tiny DOM start (snarfed and simplified from DaCHS stanxml)
# (used to write HTML)

class _Element(object):
    """An element within a DOM.

    Essentially, this is a simple way to build elementtrees. You can
    reach the embedded elementtree Element as node.

    Add elements, sequences, etc., using indexation, attributes using
    function calls; names with dashes are written with underscores, python
    reserved words have a trailing underscore.
    """
    _generator_t = type((x for x in ()))

    def __init__(self, name):
        self.node = etree.Element(name)

    def add_text(self, tx):
        """appends tx at the end of the current content.
        """
        if len(self.node):
            self.node[-1].tail = (self.node[-1].tail or "")+tx
        else:
            self.node.text = (self.node.text or "")+tx

    def __getitem__(self, child):
        if child is None:
            return

        elif isinstance(child, basestring):
            self.add_text(child)

        elif isinstance(child, (int, float)):
            self.add_text(str(child))

        elif isinstance(child, _Element):
            self.node.append(child.node)

        elif hasattr(child, "__iter__"):
            for c in child:
                self[c]

        else:
            raise Exception("%s element %s cannot be added to %s node"%(
                type(child), repr(child), self.node.tag))
        return self

    def __call__(self, **kwargs):
        for k, v in kwargs.iteritems():
            if k.endswith("_"):
                k = k[:-1]
            k = k.replace("_", "-")
            self.node.attrib[k] = v
        return self

    def dump(self, encoding="utf-8", dest_file=sys.stdout):
        etree.ElementTree(self.node).write(dest_file)


class _T(object):
    """a very simple templating engine.

    Essentially, you get HTML elements by saying T.elementname, and
    you'll get an _Element with that tag name.

    This is supposed to be instantiated to a singleton (here, T).
    """
    def __getattr__(self, key):
        return _Element(key)

T = _T()


############ The term class and associated code

def make_ttl_literal(ob):
    """returns a turtle literal for an object.

    Really, at this point only strings are supported. However, if something
    looks like a URI (see is_URI), it's going to be treated as a URI.
    """
    if isinstance(ob, bool):
        return "true" if ob else "false"

    assert isinstance(ob, basestring)
    if is_URI(ob):
        return "<{}>".format(ob)
    else:
        if "\n" in ob:
            return '"""{}"""'.format(ob.encode("utf-8"))
        else:
            return '"{}"'.format(ob.encode("utf-8").replace('"', '\\"'))

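For reference, the escaping rules this function implements can be sketched in Python 3 (the file above is Python 2, so this re-implementation drops the .encode calls; the inline URI test is a simplified stand-in for is_URI):

```python
def make_ttl_literal(ob):
    # booleans map to the bare turtle keywords
    if isinstance(ob, bool):
        return "true" if ob else "false"
    # URI-ish strings become <...> references rather than literals
    if "://" in ob or ob.startswith("#"):
        return "<{}>".format(ob)
    # multi-line strings use turtle's triple-quote form
    if "\n" in ob:
        return '"""{}"""'.format(ob)
    # otherwise: a plain quoted literal with embedded quotes escaped
    return '"{}"'.format(ob.replace('"', '\\"'))

print(make_ttl_literal("#obscore"))   # <#obscore>
print(make_ttl_literal('say "hi"'))   # "say \"hi\""
```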

class Term(object):
    """A term in our vocabulary.

    These have predicate, label, description, parent, synonym attributes
    and are constructed with arguments in that order. parent and synonym
    can be left out.
    """
    def __init__(self, predicate, label, description, parent=None,
            synonym=None):
        self.predicate, self.label = predicate, label
        self.description, self.parent = description, parent
        self.synonym = synonym

    def as_ttl(self):
        """returns a turtle representation of this term in a string.
        """
        fillers = {
            "predicate": self.predicate,
            "label": make_ttl_literal(self.label),
            "comment": make_ttl_literal(self.description),
        }
        template = [
            "<#{predicate}> a rdf:Property",
            "rdfs:label {label}",
            "rdfs:comment {comment}"]

        if self.parent:
            template.append("rdfs:subPropertyOf {parent}")
            fillers["parent"] = make_ttl_literal(self.parent)

        if self.synonym:
            template.append("owl:equivalentProperty {synonym}")
            template.append("a owl:DeprecatedProperty")
            fillers["synonym"] = make_ttl_literal(self.synonym)

        return ";\n ".join(template).format(**fillers)+"."

    def as_html(self):
        """returns elementtree for an HTML table line for this term.
        """
        return T.tr[
            T.td(class_="predicate")[self.predicate],
            T.td(class_="label")[self.label],
            T.td(class_="description")[self.description],
            T.td(class_="parent")[self.parent or ""],
            T.td(class_="preferred")[self.synonym or ""],]


########### Vocabulary management

def _make_vocab_meta(parser, vocab_path, root_uri):
    """returns a vocabulary dictionary for vocab_path from a ConfigParser
    instance parser.

    This makes sure all the necessary keys are present and that the
    implied terms file is readable; also, it generates the terms file
    name.
    """
    vocab_def = dict((key, value.decode("utf-8"))
        for key, value in parser.items(vocab_path))

    missing_keys = MANDATORY_KEYS-set(vocab_def)
    if missing_keys:
        raise ReportableError("Vocabulary definition for {} incomplete:"
            " {} missing.".format(vocab_path, ", ".join(missing_keys)))

    vocab_def["baseuri"] = root_uri+vocab_path
    vocab_def["vocab_path"] = vocab_path
    vocab_def["name"] = vocab_path.split("/")[-1]
    if "filename" in vocab_def:
        vocab_def["terms_fname"] = vocab_def["filename"]
    else:
        vocab_def["terms_fname"] = os.path.join(vocab_path, "terms.csv")
    vocab_def["draft"] = vocab_def.get("draft", "False").lower()=="true"

    try:
        with open(vocab_def["terms_fname"]) as f:
            _ = f.read()
    except IOError:
        raise ReportableError(
            "Expected terms file {} cannot be read.".format(
                vocab_def["terms_fname"]))

    return vocab_def


def read_meta(input_name, root_uri):
    """reads the vocabulary configuration and returns a sequence
    of vocabulary definition dicts.
    """
    parser = ConfigParser()
    try:
        with open(input_name) as f:
            parser.readfp(f)
    except IOError:
        raise ReportableError(
            "Cannot open or read vocabulary configuration {}".format(input_name))

    meta = []
    for vocab_path in parser.sections():
        meta.append(_make_vocab_meta(parser, vocab_path, root_uri))
    return meta


def make_stan_tree_HTML(vocab_def, make_body):
    """returns a stan tree for the format-independent part of vocabulary
    HTML.

    make_body must be a function() -> stan returning the variable
    content.
    """
    return T.html(xmlns="http://www.w3.org/1999/xhtml")[
        T.head[
            T.title["IVOA Vocabulary: "+vocab_def["title"]],
            T.meta(http_equiv="content-type",
                content="text/html;charset=utf-8"),
            T.script(type="text/javascript")[JAVASCRIPT],
            T.style(type="text/css")[
                CSS_STYLE],],
        T.body[
            T.h1["IVOA Vocabulary: "+vocab_def["title"]],
            T.div(class_="intro")[
                T.p["This is the description of the namespace ",
                    T.code[vocab_def["baseuri"]],
                    " as of {}.".format(vocab_def["timestamp"])],
                T.p(class_="draftwarning")["This vocabulary is not"
                    " yet approved by the IVOA. This means that"
                    " terms can still disappear without prior notice."]
                    if vocab_def["draft"] else "",
                T.p(class_="description")[vocab_def["description"]]],
            make_body()]]


def write_meta_inf(vocab_def):
    """writes a "short" META.INF for use by the vocabulary TOC generator
    at the IVOA web page to the current directory.
    """
    with open("META.INF", "w") as f:
        f.write(u"Name: {}\n{}\n".format(
            vocab_def["title"],
            textwrap.fill(
                vocab_def["description"],
                initial_indent="Description: ",
                subsequent_indent=" ")).encode("utf-8"))
        if vocab_def["draft"]:
            f.write("Status: Draft")


def write_htaccess(template, vocab_def):
    """writes a customised .htaccess for content negotiation.

    This must be called one level up from the ttl and html files.
    """
    install_base = urlparse.urlparse(vocab_def["baseuri"]).path+"/"
    with open(".htaccess", "w") as f:
        f.write(template.format(
            install_base=install_base,
            timestamp=vocab_def["timestamp"],
            name=vocab_def["name"]))
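To see what this template expansion produces, here is a sketch with invented values; Python 3's urllib.parse stands in for the Python 2 urlparse module used above, and the template is a reduced fragment of HT_ACCESS_TEMPLATE_CSV:

```python
from urllib.parse import urlparse

# reduced two-line fragment of the CSV .htaccess template
template = ("RewriteBase {install_base}\n"
    "RewriteRule ^$ {timestamp}/{name}.html [R=303]")

# hypothetical vocabulary values
baseuri = "http://www.ivoa.net/rdf/myterms"
htaccess = template.format(
    install_base=urlparse(baseuri).path+"/",  # path part only, as above
    timestamp="2016-08-17",
    name="myterms")
print(htaccess)
```

The R=303 redirect points clients at the release directory named after the vocabulary's timestamp.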


########### Parsing CSV, generating HTML, TTL, RDF/X

def add_rdf_file(turtle_name):
    """uses rapper to turn our generated turtle file into a suitably named
    RDF file.
    """
    with open(turtle_name[:-3]+"rdf", "w") as f:
        rapper = subprocess.Popen(["rapper", "-iturtle", "-ordfxml",
                turtle_name],
            stdout=f,
            stderr=subprocess.PIPE)
        _, msgs = rapper.communicate()

    if rapper.returncode!=0:
        sys.stderr.write("Output of the failed rapper run:\n")
        sys.stderr.write(msgs)
        raise ReportableError("Conversion to RDF+XML failed; see output above.")


def write_ontology(vocab_def, terms):
    """writes a turtle file for terms into the current directory.

    The file will be called vocab_def["name"].ttl.
    """
    with open(vocab_def["name"]+".ttl", "w") as f:
        meta_items = dict((k, make_ttl_literal(v))
            for k, v in vocab_def.items())
        meta_items["creators"] = ",\n ".join(
            '[ foaf:name {} ]'.format(make_ttl_literal(n.strip()))
            for n in vocab_def["authors"].split(";"))
        f.write(TTL_HEADER_TEMPLATE.format(**meta_items))

        for term in terms:
            f.write(term.as_ttl())
            f.write("\n\n")

    add_rdf_file(vocab_def["name"]+".ttl")


def write_html(vocab_def, terms):
    """writes an HTML-format documentation for terms into the current
    directory.

    The file will be called vocab_def["name"].html.
    """
    term_table = T.table(class_="terms")[
        T.thead[
            T.tr[
                T.th(title="The formal name of the predicate as used in URIs"
                    )["Predicate"],
                T.th(title="Suggested label for the predicate in human-facing UIs"
                    )["Label"],
                T.th(title="Human-readable description of the predicate"
                    )["Description"],
                T.th(title="If the predicate is in a wider-narrower relationship"
                    " to other predicates: The more general term.")["Parent"],
                T.th(title="If the predicate has been superseded by another"
                    " term but is otherwise synonymous with it: The term that"
                    " should now be preferentially used")["Preferred"],
            ],
        ],
        T.tbody[
            [t.as_html() for t in terms]
        ]
    ]

    def make_body():
        return [
            term_table,
            T.p(class_="outro")["Alternate formats: ",
                T.a(href=vocab_def["name"]+".rdf")["RDF"],
                ", ",
                T.a(href=vocab_def["name"]+".ttl")["Turtle"],
                "."]]

    doc = make_stan_tree_HTML(vocab_def, make_body)

    with open(vocab_def["name"]+".html", "w") as f:
        doc.dump(dest_file=f)


def parse_terms(src_name):
    """returns a sequence of Terms from a CSV input.
    """
    parent_stack = []
    last_predicate = None
    terms = []
    with open(src_name) as f:
        for index, rec in enumerate(csv.reader(f, delimiter=";")):
            rec = [(s or None) for s in rec]

            try:
                hierarchy_level = int(rec[1])
                if hierarchy_level-1>len(parent_stack):
                    parent_stack.append(last_predicate)
                while hierarchy_level-1<len(parent_stack):
                    parent_stack.pop()
                last_predicate = rec[0]
                if not is_URI(last_predicate):
                    last_predicate = "#"+last_predicate

                if parent_stack:
                    parent = parent_stack[-1]
                else:
                    parent = None

                if len(rec)<5:
                    synonym = None
                else:
                    synonym = rec[4]
                    # rec[4] may be None after the (s or None) normalisation
                    if synonym and not is_URI(synonym):
                        synonym = "#"+synonym

                terms.append(
                    Term(rec[0], rec[2].decode("utf-8"),
                        rec[3].decode("utf-8"), parent, synonym))
            except IndexError:
                sys.exit(
                    "{}, rec {}: Incomplete record {}.".format(
                        src_name, index, rec))

    return terms
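The level bookkeeping in parse_terms can be isolated into a small sketch: a stack of ancestor predicates grows by one entry when the level increases and is trimmed when it decreases, so a term at level n always sees n-1 ancestors on the stack. assign_parents below is a hypothetical helper for illustration, not part of this script:

```python
def assign_parents(rows):
    """rows: (predicate, level) pairs; returns {predicate: parent or None}."""
    parent_stack = []
    last_predicate = None
    parents = {}
    for predicate, level in rows:
        if level-1 > len(parent_stack):
            # one level deeper: the previous term becomes an ancestor
            parent_stack.append(last_predicate)
        while level-1 < len(parent_stack):
            # back up: drop ancestors until the stack matches the level
            parent_stack.pop()
        parents[predicate] = parent_stack[-1] if parent_stack else None
        last_predicate = predicate
    return parents

print(assign_parents([("a", 1), ("b", 2), ("c", 2), ("d", 1)]))
```

Here b and c are both children of a, while d returns to the root level and gets no parent.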


def build_vocab_csv(vocab_def, dest_root):
    """builds, in a subdirectory named <dest_root>/<name>/<timestamp>, all files
    necessary on the server side.

    It also puts an .htaccess into the <name>/ directory that will redirect
    clients to the appropriate files of this release based on content
    negotiation.
    """
    try:
        terms = parse_terms(vocab_def["terms_fname"])
    except:
        sys.stderr.write(
            "The following error was raised from within {}:\n".format(
                vocab_def["terms_fname"]))
        raise

    with work_dir(
            os.path.join(
                dest_root,
                vocab_def["vocab_path"],
                vocab_def["timestamp"])):
        write_ontology(vocab_def, terms)
        write_html(vocab_def, terms)

    with work_dir(
            os.path.join(dest_root, vocab_def["vocab_path"])):
        write_htaccess(HT_ACCESS_TEMPLATE_CSV, vocab_def)
        write_meta_inf(vocab_def)


########### Parsing SKOS, generating HTML

try:
    import skosify
    from rdflib.term import URIRef
except ImportError:
    sys.stderr.write("skosify and/or rdflib python modules missing;"
        " this will break as soon as SKOS vocabularies are processed.\n")

def get_related_items(voc, relationship, term, make_links=False):
    """returns "terms" related to term in voc with relationship.

    relationship is a (text) URI of the relationship (i.e., predicate),
    term is a (text) URI.

    This yields stan li elements. If make_links is true, it will try
    to produce sensible links.
    """
    for _, _, item in voc.triples((term, URIRef(relationship), None)):
        # HACK ALERT: We should be checking if the URI before the # actually
        # is our vocabulary URI here.
        if "#" in item:
            label = "#"+item.split("#")[-1]
            yield T.li[T.a(href=label)[label]]
        else:
            yield T.li[item]


def make_skos_row(voc, term):
    """returns a (stan) HTML row for a skos term.

    The columns are declared in the term_table template in
    write_html_for_skos.
    """
    return T.tr(id=term.split("#")[-1])[
        T.td["#"+term.split("#")[-1]],
        T.td[  # labels
            T.ul(class_="compactlist")[
                itertools.chain(
                    get_related_items(voc,
                        "http://www.w3.org/2004/02/skos/core#prefLabel",
                        term),
                    get_related_items(voc,
                        "http://www.w3.org/2004/02/skos/core#altLabel",
                        term))]],
        T.td[  # human-readable description
            T.ul(class_="compactlist")[get_related_items(voc,
                "http://www.w3.org/2004/02/skos/core#definition",
                term)]],
        T.td[  # broader
            T.ul(class_="compactlist")[get_related_items(voc,
                "http://www.w3.org/2004/02/skos/core#broader",
                term)]],
        T.td[  # narrower
            T.ul(class_="compactlist")[get_related_items(voc,
                "http://www.w3.org/2004/02/skos/core#narrower",
                term)]],
    ]


def write_html_for_skos(vocab_def, voc):
    """writes an HTML representation of the skosify.skosify result voc
    into the current directory.
    """
    term_table = T.table(class_="terms")[
        T.thead[
            T.tr[
                T.th(title="The formal name of the predicate as used in URIs"
                    )["Predicate"],
                T.th(title="Suggested label(s) for the predicate"
                    " in human-facing UIs (preferred label first)"
                    )["Label"],
                T.th(title="Human-readable description of the predicate"
                    )["Description"],
                T.th(title="More general term(s)")["Broader"],
                T.th(title="More specialised term(s)")["Narrower"],
            ],
        ],
        T.tbody[
            [make_skos_row(voc, t) for t in sorted(set(voc.subjects()))]
        ]
    ]

    def make_body():
        return [
            term_table,
            T.p(class_="outro")["Alternate format: ",
                T.a(href=vocab_def["name"]+".skos")["SKOS"],
                "."]]

    doc = make_stan_tree_HTML(vocab_def, make_body)

    with open(vocab_def["name"]+".html", "w") as f:
        doc.dump(dest_file=f)


def build_vocab_skos(vocab_def, dest_root):
    voc = skosify.skosify(vocab_def["terms_fname"])
    with open(vocab_def["terms_fname"]) as f:
        skos_source = f.read()

    with work_dir(
            os.path.join(
                dest_root,
                vocab_def["vocab_path"],
                vocab_def["timestamp"])):
        name_stem = os.path.splitext(
            os.path.basename(vocab_def["terms_fname"]))[0]
        with open(name_stem+".skos", "wb") as dest:
            dest.write(skos_source)
        write_html_for_skos(vocab_def, voc)

    with work_dir(
            os.path.join(dest_root, vocab_def["vocab_path"])):
        write_htaccess(HT_ACCESS_TEMPLATE_SKOS, vocab_def)
        write_meta_inf(vocab_def)


########### User interface

def parse_command_line():
    import argparse
    parser = argparse.ArgumentParser(
        description='Creates RDF, HTML and turtle files for a set of vocabularies.')
    parser.add_argument("vocab_config",
        help="Name of the vocabularies configuration file.",
        type=str)
    parser.add_argument("--root-uri",
        help="Use URI as the common root of the vocabularies instead of"
            " the official IVOA location as the root of the vocabulary"
            " hierarchy. This is for test installations at this point.",
        action="store",
        dest="root_uri",
        default="http://www.ivoa.net/rdf/",
        metavar="URI")
    parser.add_argument("--build-only",
        help="Only build VOCNAME (if available in vocab.conf)",
        action="store",
        dest="build_only",
        metavar="VOCNAME")
    parser.add_argument("--dest-dir",
        help="Write HTML and RDF output files to PATH (default: build).",
        action="store",
        dest="dest_dir",
        default="build",
        metavar="PATH")
    args = parser.parse_args()

    if not args.root_uri.endswith("/"):
        args.root_uri = args.root_uri+"/"

    return args


def main():
    args = parse_command_line()
    meta = read_meta(args.vocab_config, args.root_uri)

    for vocab_def in meta:
        if args.build_only and args.build_only!=vocab_def["name"]:
            continue

        # dispatch builders on source extension -- ah well.
        source_file = vocab_def["terms_fname"]
        if source_file.endswith(".csv"):
            build_vocab_csv(vocab_def, args.dest_dir)
        elif source_file.endswith(".skos"):
            build_vocab_skos(vocab_def, args.dest_dir)
        else:
            raise ReportableError("Unknown source format in {}".format(
                source_file))


if __name__=="__main__":
    try:
        main()
    except ReportableError, msg:
        sys.stderr.write("*** Fatal: {}\n".format(msg))
        sys.exit(1)

# vi:sw=4:et:sta

Properties:

svn:executable: *
