
knitzschke/odscwest-search-engine-workshop

 
 


Search Engine Workshop

About

Hands-on workshop for building a semantic search engine.

Setup

If you came to this repo during a workshop, visit this custom JupyterHub, which has all the dependencies already set up.

Otherwise, consider using Binder.

Content

Notebooks

  1. Data Fetching: Internal notebooks that show how to fetch a dump of the Stack Overflow XML.

  2. Data Processing: Notebook that processes the XML dump and saves it to smaller Parquet files.

  3. Non Deep Learning Retrieval: Shows how to index and retrieve documents using Elasticsearch. (Link)

  4. Deep Learning Retrieval: Shows how to index and retrieve documents using a fine-tuned deep learning retriever. (Link) Also includes a sample cross-encoder notebook taken from the SentenceTransformers docs. (Link)

  5. ANN: Shows how to speed up deep learning retrieval by exploring different approximate nearest neighbor (ANN) indexes. (Link)
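To give a feel for what the ANN notebook is about, here is a minimal pure-Python sketch of the idea. It is not taken from the notebooks. It compares a brute-force search, which scores every vector, with a toy random-hyperplane LSH index that only scores vectors in the query's bucket. The names `exact_search` and `HyperplaneLSH` are hypothetical, the vectors are random stand-ins for real embeddings, and real ANN libraries use more refined index structures than this one.

```python
import math
import random
from collections import defaultdict


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def exact_search(query, vectors, k=3):
    """Brute force: score every stored vector, so cost grows linearly with N."""
    ranked = sorted(range(len(vectors)),
                    key=lambda i: cosine(query, vectors[i]),
                    reverse=True)
    return ranked[:k]


class HyperplaneLSH:
    """Toy ANN index (hypothetical, for illustration only).

    Each vector gets a signature: one bit per random hyperplane, telling
    which side of the plane it lies on. Vectors with the same signature
    share a bucket, and a query is only compared against its own bucket.
    """

    def __init__(self, dim, n_planes=4, seed=0):
        rng = random.Random(seed)
        self.planes = [[rng.gauss(0, 1) for _ in range(dim)]
                       for _ in range(n_planes)]
        self.buckets = defaultdict(list)
        self.vectors = []

    def _signature(self, vec):
        return tuple(sum(p * v for p, v in zip(plane, vec)) >= 0
                     for plane in self.planes)

    def add(self, vec):
        self.vectors.append(vec)
        self.buckets[self._signature(vec)].append(len(self.vectors) - 1)

    def search(self, query, k=3):
        # Approximate: true neighbours in other buckets are missed.
        candidates = self.buckets.get(self._signature(query), [])
        ranked = sorted(candidates,
                        key=lambda i: cosine(query, self.vectors[i]),
                        reverse=True)
        return ranked[:k]


# Random vectors stand in for document embeddings (an assumption).
rng = random.Random(42)
docs = [[rng.gauss(0, 1) for _ in range(16)] for _ in range(500)]

index = HyperplaneLSH(dim=16, n_planes=4)
for d in docs:
    index.add(d)

query = docs[7]
exact = exact_search(query, docs)
approx = index.search(query)
scored = len(index.buckets[index._signature(query)])
print("exact top-k: ", exact)
print("approx top-k:", approx)
print("vectors scored by the LSH search:", scored, "of", len(docs))
```

Adding more hyperplanes makes buckets smaller, so queries get faster but are more likely to miss true neighbours. That speed versus recall trade-off is the main thing to look at when comparing ANN indexes.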

Slides

[ODSC 2022 Slides](assets/slides_odsc2022.pdf)

Contact

For help or feedback, please reach out to:
