Questions that need to be answered

Minchul Park edited this page Mar 15, 2016 · 1 revision
  • Problem definition

    • What is the value of the topic? How will this be evaluated?
    • What are the challenges?
    • What ambiguities do you expect to encounter and have to resolve?
    • What kind of features would you use?
      • The words? Perhaps other features as well?
    • What is the starting point?
    • What improvements would be made, and how can they be achieved?
    • Problem-specific questions
      • Are there two distinct tasks, summarization and emotion detection?
      • What are the 8 sentiment classes?
      • Would it identify specific parts of the writing that lead to different reactions? How?
      • What exactly is meant by "text pertaining to the character"?
  • Methodology (how to do it)

    • What technique will be used?
    • What models/procedures would be used for the task?
    • How would the task be done?
      • How will you identify the characters in the story?
      • How would inferences be drawn from the corpus?
      • How would meanings be extracted from the question?
        • How would they be represented?
        • How would that be used to query the knowledge base?
      • How would the question be transformed into a geographical database query?
      • How would the prediction work?
  • Dataset

    • What language would the data be in?
    • What corpus will be used?
    • Is there any particular reason you want to use a certain language?
    • Will you use spoken or written language? (If written, how is it relevant to dialogue systems?)
  • Evaluation

    • How will the result be evaluated?
    • What defines good data for a given task?
    • Does the content come with explicit reviews (such as a star rating)?
    • If not, can a site with such ratings (like Yelp) be found? It could then be crawled to get training data for the models.
    • If the task is performed on unlabeled data, how will performance be evaluated?
  • Advice

    • Creating an application is great, but it shouldn't consume too much of the effort; the majority of the effort should be spent on language processing.
    • You should look at the particulars of your corpus, and also do a small literature review to see what kind of techniques have been used in the field.
    • It would be good to do a short review of existing methods and what they can accomplish.