fozziethebeat edited this page Oct 27, 2011 · 5 revisions

Package Layout

The S-Space Package is split into three types of packages:

  1. Core utility packages. These include tools for common data structures, matrices, and vectors.
  2. Core interfaces and tools for developing Semantic Space algorithms.
  3. Semantic Space implementations.

Utility Packages

  • Vector: A collection of fast vector implementations focused on sparse double vectors, with support for vectors of other primitive types. These can be serialized and deserialized to/from a variety of formats. Vectors serve as the main data structure backing our distributional semantics.
  • Matrix: A collection of fast matrix implementations with a focus on sparse matrices. These can be serialized and deserialized to/from a variety of formats.
  • Util: A collection of common data structures modeled after the java.util package. This includes some tools that are similar in spirit to GNU Trove or Google Guava.
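
To make the idea of a sparse double vector concrete, here is a minimal sketch in plain Java. It is not the S-Space API: the class name `SparseVectorSketch` and all of its methods are hypothetical, chosen only for illustration. The sketch stores only the non-zero entries, so memory use and the cost of a dot product depend on the number of non-zero values rather than on the vector's nominal length.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical illustration only; not a class from the S-Space package. */
final class SparseVectorSketch {
    private final int length;
    // Only non-zero entries are stored, keyed by dimension index.
    private final Map<Integer, Double> nonZero = new TreeMap<>();

    SparseVectorSketch(int length) {
        if (length < 0) {
            throw new IllegalArgumentException("length must be non-negative");
        }
        this.length = length;
    }

    void set(int index, double value) {
        checkIndex(index);
        if (value == 0d) {
            nonZero.remove(index);   // writing a zero frees the entry
        } else {
            nonZero.put(index, value);
        }
    }

    double get(int index) {
        checkIndex(index);
        return nonZero.getOrDefault(index, 0d);
    }

    int length() {
        return length;
    }

    int nonZeroCount() {
        return nonZero.size();
    }

    /** Dot product that iterates over the vector with fewer stored entries. */
    double dot(SparseVectorSketch other) {
        if (other.length != length) {
            throw new IllegalArgumentException("vector lengths differ");
        }
        SparseVectorSketch small = nonZero.size() <= other.nonZero.size() ? this : other;
        SparseVectorSketch large = (small == this) ? other : this;
        double sum = 0d;
        for (Map.Entry<Integer, Double> e : small.nonZero.entrySet()) {
            sum += e.getValue() * large.nonZero.getOrDefault(e.getKey(), 0d);
        }
        return sum;
    }

    private void checkIndex(int index) {
        if (index < 0 || index >= length) {
            throw new IndexOutOfBoundsException("index " + index + " out of range");
        }
    }

    public static void main(String[] args) {
        SparseVectorSketch a = new SparseVectorSketch(1_000_000);
        SparseVectorSketch b = new SparseVectorSketch(1_000_000);
        a.set(3, 2.0);
        a.set(999_999, 1.5);
        b.set(3, 4.0);
        b.set(10, 7.0);
        System.out.println("stored entries in a: " + a.nonZeroCount());
        System.out.println("a . b = " + a.dot(b));
    }
}
```

A sorted map keeps the sketch short. A production implementation would more likely use primitive arrays or a primitive-keyed hash map to avoid boxing overhead.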

Interfaces and Tools

  • [Common] (/fozziethebeat/S-Space/wiki/Common): The core interface for all Semantic Space implementations, abstract S-Space implementations and serialization utilities, and a collection of Similarity metrics.
  • [Clustering] (/fozziethebeat/S-Space/wiki/Clustering): A collection of interfaces and implementations for clustering algorithms.
  • [Dependency] (/fozziethebeat/S-Space/wiki/Dependency): Tools for interacting with dependency parsed corpora and dependency parse trees.
  • [Basis] (/fozziethebeat/S-Space/wiki/Basis): A collection of tools for mapping features to unique indices.
  • [Index] (/fozziethebeat/S-Space/wiki/Index): A collection of tools for generating index vectors that represent terms in a reduced sub-space.
  • [Text] (/fozziethebeat/S-Space/wiki/Text): A collection of tools for interacting with various corpora.
  • [Tools] (/fozziethebeat/S-Space/wiki/Tools): A collection of miscellaneous tools for intermediate data processing. These often relate to specific experiments we've run using the S-Space package.
  • [Evaluation] (/fozziethebeat/S-Space/wiki/Evaluation): A collection of semantic evaluation tasks that compare Semantic Space results to a variety of human gold standards.
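
As an illustration of what "mapping features to unique indices" means in practice, the following is a minimal sketch in plain Java. It does not reproduce the interfaces in the Basis package: the class `BasisMappingSketch` and its method names are hypothetical. Each previously unseen feature is assigned the next free dimension, and repeated features always map back to the same dimension.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical illustration only; not a class from the S-Space package. */
final class BasisMappingSketch {
    // Insertion-ordered so that iteration follows dimension order.
    private final Map<String, Integer> featureToIndex = new LinkedHashMap<>();

    /** Returns the dimension for a feature, assigning the next free one if unseen. */
    int getDimension(String feature) {
        Integer index = featureToIndex.get(feature);
        if (index == null) {
            index = featureToIndex.size();
            featureToIndex.put(feature, index);
        }
        return index;
    }

    /** Number of distinct features seen so far. */
    int numDimensions() {
        return featureToIndex.size();
    }

    public static void main(String[] args) {
        BasisMappingSketch basis = new BasisMappingSketch();
        for (String token : "the cat saw the dog".split(" ")) {
            System.out.println(token + " -> " + basis.getDimension(token));
        }
        System.out.println("dimensions: " + basis.numDimensions());
    }
}
```

In this sketch the tokens themselves are the features. A co-occurrence model could just as well use richer features, such as a word paired with its relative position or its dependency relation.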

Implementations

  • [Mains] (/fozziethebeat/S-Space/wiki/mains): All of the main classes that run Semantic Space algorithms.
  • [Beagle] (/fozziethebeat/S-Space/wiki/Beagle): A hologram-based representation of word co-occurrence.
  • [Coals] (/fozziethebeat/S-Space/wiki/Coals): A reduced co-occurrence-based Semantic Space.
  • [Dependency Random Indexing] (/fozziethebeat/S-Space/wiki/DependencyRandomIndexing): A version of Random Indexing that finds co-occurrences in dependency parse trees.
  • [Dependency Vector Space] (/fozziethebeat/S-Space/wiki/DependencyVectorSpace): A dependency tree based co-occurrence Semantic Space.
  • [Explicit Semantic Analysis] (/fozziethebeat/S-Space/wiki/ExplicitSemanticAnalysis): An extension to the Vector Space Model that allows for summaries of new documents.
  • [Grefenstette] (/fozziethebeat/S-Space/wiki/Grefenstette): An early co-occurrence based Semantic Space that extracts occurrences from parse trees.
  • [Hyperspace Analogue To Language] (/fozziethebeat/S-Space/wiki/HyperspaceAnalogueToLanguage): A word co-occurrence space that encodes word ordering.
  • [Incremental Semantic Space] (/fozziethebeat/S-Space/wiki/IncrementalSemanticSpace): A second-order, index-vector-based co-occurrence space.
  • [Latent Relational Analysis] (/fozziethebeat/S-Space/wiki/LatentRelationalAnalysis): A relational analysis based semantic space.
  • [Latent Semantic Analysis] (/fozziethebeat/S-Space/wiki/LatentSemanticAnalysis): A term-by-document space that reduces the feature space with Singular Value Decomposition. Perhaps the best-known Semantic Space.
  • [Nonlinear Semantic Spaces] (/fozziethebeat/S-Space/wiki/NonLinearSpace): Variations on HAL and LSA that reduce the feature spaces with non-linear reduction methods.
  • [Purandare & Pedersen] (/fozziethebeat/S-Space/wiki/PurandareAndPedersen): An early first-order word sense induction model.
  • [Random Indexing] (/fozziethebeat/S-Space/wiki/RandomIndexing): A co-occurrence space that automatically reduces the feature space by using index vectors.
  • [Reflective Random Indexing] (/fozziethebeat/S-Space/wiki/ReflectiveRandomIndexing): A second-order extension to Random Indexing.
  • [Temporal Random Indexing] (/fozziethebeat/S-Space/wiki/TemporalRandomIndexing): A temporal extension to Random Indexing that tracks semantic variations over time.
  • [Structured Vector Space] (/fozziethebeat/S-Space/wiki/StructuredVectorSpace): A multi-vector semantic space that separates a semantic vector based on dependency relations.
  • [Vector Space Model] (/fozziethebeat/S-Space/wiki/VectorSpaceModel): An early term-by-document semantic space.
  • [Wordsi] (/fozziethebeat/S-Space/wiki/Wordsi): A general framework for building Word Sense Induction models.
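
Several of the implementations above build on index vectors, so a small sketch of the basic Random Indexing idea may help. This is plain Java written for illustration, not the S-Space implementation: the class `RandomIndexingSketch`, its methods, and the parameter values (vector length, number of non-zero entries, window size, seed) are all assumptions. Each word gets a fixed sparse random vector of +1 and -1 entries, and a word's semantic vector is the sum of the index vectors of the words that co-occur with it inside a window.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Random;

/** Hypothetical illustration only; not a class from the S-Space package. */
final class RandomIndexingSketch {
    private final int dimensions;
    private final int nonZeroPerVector;
    private final Random random;
    private final Map<String, int[]> indexVectors = new HashMap<>();
    private final Map<String, int[]> semanticVectors = new HashMap<>();

    RandomIndexingSketch(int dimensions, int nonZeroPerVector, long seed) {
        if (nonZeroPerVector > dimensions) {
            throw new IllegalArgumentException("more non-zero entries than dimensions");
        }
        this.dimensions = dimensions;
        this.nonZeroPerVector = nonZeroPerVector;
        this.random = new Random(seed);
    }

    /** Returns the fixed random index vector for a word, creating it on first use. */
    int[] indexVector(String word) {
        return indexVectors.computeIfAbsent(word, w -> {
            int[] v = new int[dimensions];
            int placed = 0;
            while (placed < nonZeroPerVector) {
                int pos = random.nextInt(dimensions);
                if (v[pos] == 0) {
                    // Alternate signs so +1 and -1 entries are balanced.
                    v[pos] = (placed % 2 == 0) ? 1 : -1;
                    placed++;
                }
            }
            return v;
        });
    }

    /** Adds the index vectors of neighbors within the window to each token's semantic vector. */
    void processSentence(String[] tokens, int window) {
        for (int i = 0; i < tokens.length; i++) {
            int[] semantic = semanticVectors.computeIfAbsent(tokens[i], w -> new int[dimensions]);
            int start = Math.max(0, i - window);
            int end = Math.min(tokens.length - 1, i + window);
            for (int j = start; j <= end; j++) {
                if (j == i) {
                    continue;   // a token does not co-occur with itself
                }
                int[] index = indexVector(tokens[j]);
                for (int d = 0; d < dimensions; d++) {
                    semantic[d] += index[d];
                }
            }
        }
    }

    /** Returns the accumulated semantic vector, or null if the word was never seen. */
    int[] semanticVector(String word) {
        return semanticVectors.get(word);
    }

    public static void main(String[] args) {
        RandomIndexingSketch ri = new RandomIndexingSketch(100, 10, 42L);
        ri.processSentence("the cat sat on the mat".split(" "), 2);
        int nonZero = 0;
        for (int value : ri.semanticVector("cat")) {
            if (value != 0) {
                nonZero++;
            }
        }
        System.out.println("non-zero dimensions for 'cat': " + nonZero);
    }
}
```

The dimensionality is fixed up front, so the semantic vectors never grow as the vocabulary grows. This is the sense in which the feature space is reduced automatically, without a separate reduction step such as SVD.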