diff --git a/README.md b/README.md index 5b2a1c8..347bccc 100644 --- a/README.md +++ b/README.md @@ -81,8 +81,8 @@ graph libraries in Python: ## Build Instructions -Note: most use cases won't need to build this package locally, -and instead will simply install directly from +**Note: most use cases won't need to build this package locally.** +Instead, simply install from [PyPi](https://pypi.python.org/pypi/kglab) or [Conda](https://docs.conda.io/). @@ -122,6 +122,8 @@ To generate documentation pages, this project uses: * [`MkDocs`](https://www.mkdocs.org/) * [`makedocs-material`](https://squidfunk.github.io/mkdocs-material/) + * [`MathJax`](https://www.mathjax.org/) + * [`pymdown-extensions`](https://facelessuser.github.io/pymdown-extensions/) * [`Jupyter`](https://jupyter.org/install) * [`nbconvert`](https://nbconvert.readthedocs.io/) * [`Selenium`](https://selenium-python.readthedocs.io/) diff --git a/docs/ack.md b/docs/ack.md index e8430f4..a333c24 100644 --- a/docs/ack.md +++ b/docs/ack.md @@ -1,6 +1,6 @@ # Acknowledgements -## Kudos +## Contributors and Supporters Many thanks to contributors: [@jake-aft](https://github.com/jake-aft), diff --git a/docs/biblio.md b/docs/biblio.md index c94370b..e056fec 100644 --- a/docs/biblio.md +++ b/docs/biblio.md @@ -1,168 +1,187 @@ # Bibliography -## A +## – A – ### alexopoulos2020 [*Semantic Modeling for Data: Avoiding Pitfalls and Breaking Dilemmas*](https://www.goodreads.com/book/show/53483743-semantic-modeling-for-data) -Panos Alexopoulos +**Panos Alexopoulos** O'Reilly Media (2020) ### anderson2020dt [*Data Teams: A Unified Management Model for Successful Data-Focused Teams*](https://www.apress.com/us/book/9781484262276) -Jesse Anderson +**Jesse Anderson** Apress (2020) -## B +## – B – ### bell2019 [*Get Programming: Learn to code with Python*](https://www.manning.com/books/get-programming) -Ana Bell +**Ana Bell** Manning (2018) ### breiman2001 ["Statistical Modeling: The Two 
Cultures"](https://doi.org/10.1214/ss/1009213726) -Leo Breiman +**Leo Breiman** *Statist. Sci.* 16:3 (2001) ### brewer2012cap ["CAP Twelve years later: How the 'Rules' have Changed"](https://doi.org/10.1109/MC.2012.37) -Eric Brewer +**Eric Brewer** *Computer* 45:2 (2012) -## C +## – C – ### ceder2018 [*The Quick Python Book, Third Edition*](https://www.manning.com/books/the-quick-python-book-third-edition) -Naomi Ceder +**Naomi Ceder** Manning (2018) ### chollet2017 [*Deep Learning with Python*](https://www.manning.com/books/deep-learning-with-python) -François Chollet +**François Chollet** Manning (2017) -## D +## – D – ### das2020meta ["Metadata Day 2020"](https://metadataday2020.splashthat.com/) -Shirshanka Das, Paco Nathan, Nadiya Hayes, Joe M. Hellerstein, -Kapil Surlaker, Chris Williams, Natasha F. Noy, -Daniella Lowenberg, Ian Mulvany, Mark Grover, Alejandro Saucedo, -Deborah L. McGuinness, Ted Habermann, Charles Smith, Julien Le Dem, -Deepak Chandramouli, Igor Perisic, Sunheng Taing, Satyen Sangani, -Aaron Kalb, Daniel Rincon Silva +**Shirshanka Das**, **Paco Nathan**, **Nadiya Hayes**, **Joe M. Hellerstein**, +**Kapil Surlaker**, **Chris Williams**, **Natasha F. Noy**, +**Daniella Lowenberg**, **Ian Mulvany**, **Mark Grover**, **Alejandro Saucedo**, +**Deborah L. McGuinness**, **Ted Habermann**, **Charles Smith**, **Julien Le Dem**, +**Deepak Chandramouli**, **Igor Perisic**, **Sunheng Taing**, **Satyen Sangani**, +**Aaron Kalb**, **Daniel Rincon Silva** LinkedIn (2020) -## G +## – G – ### gosnell2020 [*The Practitioner's Guide to Graph Data*](https://www.goodreads.com/book/show/50204616-the-practitioner-s-guide-to-graph-data) -Denise Gosnell, Matthias Broecheler +**Denise Gosnell**, **Matthias Broecheler** O'Reilly Media (2020) ### gruber1993ata ["A translation approach to portable ontology specifications"](https://doi.org/10.1006/KNAC.1993.1008) -Thomas R. Gruber -*Knowledge Acquisition* 5, pp. 199-220 (1993) +**Thomas R. 
Gruber** +*Knowledge Acquisition* 5 (1993) -## H +## – H – ### hellersteinsgsa17 ["Ground: A Data Context Service"](http://cidrdb.org/cidr2017/papers/p111-hellerstein-cidr17.pdf) -Joseph M. Hellerstein, Vikram Sreekanti, Joseph E. Gonzalez, James Dalton, Akon Dey, Sreyashi Nag, Krishna Ramachandran, Sudhanshu Arora, Arka Bhattacharyya, Shirshanka Das, Mark Donsky, Gabriel Fierro, Chang She, Carl Steinbach, Venkat Subramanian, Eric Sun +**Joseph M. Hellerstein**, **Vikram Sreekanti**, **Joseph E. Gonzalez**, +**James Dalton**, **Akon Dey**, **Sreyashi Nag**, **Krishna Ramachandran**, +**Sudhanshu Arora**, **Arka Bhattacharyya**, **Shirshanka Das**, +**Mark Donsky**, **Gabriel Fierro**, **Chang She**, **Carl Steinbach**, +**Venkat Subramanian**, **Eric Sun** *CIDR* (2017) ### hogan2020knowledge ["Knowledge Graphs"](https://arxiv.org/abs/2003.02320) -Aidan Hogan, Eva Blomqvist, Michael Cochez, Claudia d'Amato, Gerard de Melo, Claudio Gutierrez, José Emilio Labra Gayo, Sabrina Kirrane, Sebastian Neumaier, Axel Polleres, Roberto Navigli, Axel-Cyrille Ngonga Ngomo, Sabbir M. Rashid, Anisa Rula, Lukas Schmelzeisen, Juan Sequeda, Steffen Staab, Antoine Zimmermann +**Aidan Hogan**, **Eva Blomqvist**, **Michael Cochez**, **Claudia d'Amato**, +**Gerard de Melo**, **Claudio Gutierrez**, **José Emilio Labra Gayo**, +**Sabrina Kirrane**, **Sebastian Neumaier**, **Axel Polleres**, +**Roberto Navigli**, **Axel-Cyrille Ngonga Ngomo**, **Sabbir M. Rashid**, +**Anisa Rula**, **Lukas Schmelzeisen**, **Juan Sequeda**, **Steffen Staab**, +**Antoine Zimmermann** *arXiv* (2020) -## J +## – J – ### jonas2019cloud ["Cloud Programming Simplified: A Berkeley View on Serverless Computing"](https://arxiv.org/abs/1902.03383) -Eric Jonas, Johann Schleier-Smith, Vikram Sreekanti, Chia-Che Tsai, Anurag Khandelwal, Qifan Pu, Vaishaal Shankar, Joao Carreira, Karl Krauth, Neeraja Yadwadkar, Joseph E. Gonzalez, Raluca Ada Popa, Ion Stoica, David A. 
Patterson +**Eric Jonas**, **Johann Schleier-Smith**, **Vikram Sreekanti**, +**Chia-Che Tsai**, **Anurag Khandelwal**, **Qifan Pu**, **Vaishaal Shankar**, +**Joao Carreira**, **Karl Krauth**, **Neeraja Yadwadkar**, **Joseph E. Gonzalez**, +**Raluca Ada Popa**, **Ion Stoica**, **David A. Patterson** *arXiv* (2019) -## K +## – K – ### kreps2014 [*I Heart Logs: Event Data, Stream Processing, and Data Integration*](https://www.confluent.io/ebook/i-heart-logs-event-data-stream-processing-and-data-integration/) -Jay Kreps +**Jay Kreps** O'Reilly Media (2014) -## L +## – L – + +### ledem2013hadoop + +["Parquet: Columnar storage for the people"](https://www.slideshare.net/julienledem/parquet-hadoop-summit-2013) +**Julien Le Dem** +*Hadoop Summit* (2013) ### lenat1982aaai ["Heuretics: Theoretical and Experimental Study of Heuristic Rules"](https://www.aaai.org/Library/AAAI/1982/aaai82-038.php) -Douglas B. Lenat +**Douglas B. 
Lenat**, **John Seely Brown** *Artificial Intelligence* 23:3 (1984) ### linden2006early ["Early Amazon: Splitting the website"](http://glinden.blogspot.com/2006/02/early-amazon-splitting-website.html) -Greg Linden +**Greg Linden** *Geeking with Greg* (2006) ### lorica2020nlp ["2020 NLP Survey Report"](https://gradientflow.com/2020nlpsurvey/) -Ben Lorica, Paco Nathan +**Ben Lorica**, **Paco Nathan** Gradient Flow (2020) ### lorica2020rai ["Responsible AI in Practice"](https://gradientflow.com/ResponsibleAI2020) -Ben Lorica, Paco Nathan, Gina Blaber, Andrew Burt, Rumman Chowdhury, Yishay Carmiel +**Ben Lorica**, **Paco Nathan**, **Gina Blaber**, **Andrew Burt**, +**Rumman Chowdhury**, **Yishay Carmiel** Gradient Flow (2020) -## N +## – N – ### nathan2014jem [*Just Enough Math*](https://derwen.ai/jem) -Paco Nathan +**Paco Nathan** O'Reilly Media (2014) ### negro2021 [*Graph-Powered Machine Learning*](https://www.manning.com/books/graph-powered-machine-learning) -Alessandro Negro +**Alessandro Negro** Manning (2021) ### noy2001ontology ["Ontology Development 101: A Guide to Creating Your First Ontology"](http://www-ksl.stanford.edu/people/dlm/papers/ontology-tutorial-noy-mcguinness-abstract.html) -Natalya F. Noy, Deborah L. McGuinness +**Natalya F. Noy**, **Deborah L. McGuinness** *Stanford Knowledge Systems Laboratory Technical Report KSL-01-05* (2001) -## P +## – P – ### perrone2020network ["Network visualizations with Pyvis and VisJS"](https://arxiv.org/abs/2006.04951) -Giancarlo Perrone, Jose Unpingco, Haw-minn Lu +**Giancarlo Perrone**, **Jose Unpingco**, **Haw-minn Lu** *arXiv* (2020) diff --git a/docs/concepts.md b/docs/concepts.md index dae438d..2089f5f 100644 --- a/docs/concepts.md +++ b/docs/concepts.md @@ -1,5 +1,10 @@ # Graph Concepts +## DRAFT: Work in progress + +This material is a work in progress, at "rough draft" stage. + + The primary abstractions used in **kglab** are based on a small set of Python classes. 
These class definitions can be subclassed and extended to handle diff --git a/docs/glossary.md b/docs/glossary.md index 9288b14..2df242f 100644 --- a/docs/glossary.md +++ b/docs/glossary.md @@ -1,6 +1,11 @@ # Glossary -## A +## DRAFT: Work in progress + +This material is a work in progress, at "rough draft" stage. + + +## – A – ### abstraction layer @@ -10,7 +15,7 @@ A technology implementing a [*separation of concerns*](#separation-of-concerns) see: -## C +## – C – ### cloud computing @@ -18,7 +23,7 @@ see: ### computable content -## D +## – D – ### data context @@ -34,7 +39,7 @@ see: ### data strategy -## G +## – G – ### graph algorithms @@ -42,7 +47,7 @@ see: ### graph-based data science -## K +## – K – ### KG @@ -64,19 +69,19 @@ and its community ### knowledge graph embedding -## M +## – M – ### machine learning see: -## N +## – N – ### natural language see: -## O +## – O – ### OSFA @@ -85,13 +90,13 @@ abbr. "One size fits all", a common antipattern in technology see: -## P +## – P – ### probabilistic graph inference ### property graph -## R +## – R – ### RDF @@ -103,13 +108,13 @@ abbr. *Resource Description Framework* see: -## S +## – S – ### semantic technologies ### separation of concerns -## W +## – W – ### W3C diff --git a/docs/javascripts/config.js b/docs/javascripts/config.js new file mode 100644 index 0000000..ece5986 --- /dev/null +++ b/docs/javascripts/config.js @@ -0,0 +1,12 @@ +window.MathJax = { + tex: { + inlineMath: [["\\(", "\\)"]], + displayMath: [["\\[", "\\]"]], + processEscapes: true, + processEnvironments: true + }, + options: { + ignoreHtmlClass: ".*|", + processHtmlClass: "arithmatex" + } +}; diff --git a/docs/use_case.md b/docs/use_case.md index 530e8b4..6c095f0 100644 --- a/docs/use_case.md +++ b/docs/use_case.md @@ -1,5 +1,10 @@ # Use Cases +## DRAFT: Work in progress + +This material is a work in progress, at "rough draft" stage. 
+ + ## Data Context ["data context"]( http://cidrdb.org/cidr2017/papers/p111-hellerstein-cidr17.pdf) – diff --git a/docs/what.md b/docs/what.md index b0f9c64..484312c 100644 --- a/docs/what.md +++ b/docs/what.md @@ -1,45 +1,52 @@ # What is a Knowledge Graph? -## Just Enough Graph Theory +## DRAFT: Work in progress + +This material is a work in progress, at "rough draft" stage. -Math: - G = { V, E } + +## Just Enough Graph Theory In a pure mathematical form, where a *node* (or *vertex*) can connect through an *edge* (or *arc* or *link*) to another node. +$$ +G=\{V, E\} +$$ + In that case an *adjacency matrix* can represent the entire graph. Each node has a row and a column in the matrix. -https://en.wikipedia.org/wiki/Adjacency_matrix -https://mathworld.wolfram.com/AdjacencyMatrix.html + * https://en.wikipedia.org/wiki/Adjacency_matrix + * https://mathworld.wolfram.com/AdjacencyMatrix.html -In the simplest form, there's a `1` value to represent an edge between nodes or a `0` otherwise. +In the simplest form, a `1` value in the matrix element represents an +edge between nodes or a `0` otherwise. -symmetric for undirected graphs -asymmetric for directed graphs + * symmetric for undirected graphs + * asymmetric for directed graphs If a directed graph has *weights* on its edges (i.e., to represent the probability of an event between two nodes) then replace the `1` value with the weight or probability. This is called a *stochastic matrix* -https://en.wikipedia.org/wiki/Stochastic_matrix -transitions of Markov chain (state) + * https://en.wikipedia.org/wiki/Stochastic_matrix + * transitions of Markov chain (state) There's an entire field of *algebraic graph theory* that translates between *graph theory* and *linear algebra*. 
eigenvalues, eigenvectors, spectrum -This leads to *factorization* (or *decomposition*): -https://en.wikipedia.org/wiki/Matrix_decomposition -https://sparse.tamu.edu/about -https://www.cs.purdue.edu/homes/dgleich/ +non-negative, symmetric properties allow for *factorization* (or *decomposition*): + * https://en.wikipedia.org/wiki/Matrix_decomposition + * https://sparse.tamu.edu/about + * https://www.cs.purdue.edu/homes/dgleich/ -## RDF Graph +## RDF Graph * RDF graph, [*semantic technologies*](../glossary/#semantic-technologies) * KG compare/contrast with [*property graph*](../glossary/#property-graph) @@ -55,8 +62,9 @@ narrative arc: => [anderson2020dt](../biblio/#anderson2020dt) with -[breiman2001](../biblio/#breiman2001) in-between - +[breiman2001](../biblio/#breiman2001) +[brewer2012cap](../biblio/#brewer2012cap) +in-between In 2018, Gartner began to acknowledge the term [*knowledge graph*](../glossary/#knowledge-graph) diff --git a/mkdocs.yml b/mkdocs.yml index 801c6c1..156ad9d 100644 --- a/mkdocs.yml +++ b/mkdocs.yml @@ -52,20 +52,29 @@ theme: name: material icon: repo: fontawesome/brands/github - logo: assets/logo.png favicon: assets/favicon.png - -extra_css: - - stylesheets/extra.css + logo: assets/logo.png + features: + - navigation.instant plugins: - mknotebooks - git-revision-date +extra_css: + - stylesheets/extra.css + +extra_javascript: + - javascripts/config.js + - https://polyfill.io/v3/polyfill.min.js?features=es6 + - https://cdn.jsdelivr.net/npm/mathjax@3/es5/tex-mml-chtml.js + markdown_extensions: - admonition - codehilite - footnotes + - pymdownx.arithmatex: + generic: true - toc: toc_depth: 3 permalink: true diff --git a/preview.py b/preview.py index 36eebbe..06f99a9 100755 --- a/preview.py +++ b/preview.py @@ -26,7 +26,7 @@ def static_proxy (path=""): else: suffix = PurePosixPath(path).suffix - if suffix not in [".css", ".js", ".png", ".svg", ".map"]: + if suffix not in [".css", ".js", ".map", ".png", ".svg", ".xml"]: path = os.path.join(path, 
"index.html") return send_from_directory(DOCS_FILES, path)