Class PassageScorer

    • Field Summary

      Fields 
      Modifier and Type Field Description
      (package private) float b
      BM25 b parameter, controls length normalization.
      (package private) float k1
      BM25 k1 parameter, controls term frequency normalization.
      (package private) float pivot
      A pivot used for length normalization.
    • Constructor Summary

      Constructors 
      Constructor Description
      PassageScorer()
      Creates PassageScorer with these default values: k1 = 1.2, b = 0.75, pivot = 87.
      PassageScorer(float k1, float b, float pivot)
      Creates PassageScorer with the specified scoring parameters.
    • Method Summary

      Methods 
      Modifier and Type Method Description
      float norm(int passageStart)
      Normalize a passage according to its position in the document.
      float score(Passage passage, int contentLength)  
      float tf(int freq, int passageLen)
      Computes term weight, given the frequency within the passage and the passage's length.
      float weight(int contentLength, int totalTermFreq)
      Computes term importance, given its in-document statistics.
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Field Detail

      • k1

        final float k1
        BM25 k1 parameter, controls term frequency normalization.
      • b

        final float b
        BM25 b parameter, controls length normalization.
      • pivot

        final float pivot
        A pivot used for length normalization.
    • Constructor Detail

      • PassageScorer

        public PassageScorer()
        Creates PassageScorer with these default values:
        • k1 = 1.2
        • b = 0.75
        • pivot = 87
      • PassageScorer

        public PassageScorer(float k1,
                             float b,
                             float pivot)
        Creates PassageScorer with the specified scoring parameters.
        Parameters:
        k1 - Controls non-linear term frequency normalization (saturation).
        b - Controls to what degree passage length normalizes tf values.
        pivot - Pivot value for length normalization (some rough idea of average sentence length in characters).
    • Method Detail

      • weight

        public float weight(int contentLength,
                            int totalTermFreq)
        Computes term importance, given its in-document statistics.
        Parameters:
        contentLength - length of the document in characters
        totalTermFreq - number of times the term occurs in the document
        Returns:
        term importance
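
The page does not state the formula behind weight. As a rough sketch only, the block below shows one plausible BM25-style form: it treats the document as roughly contentLength / pivot passages and applies an IDF-like logarithm to the term's total frequency. The class name TermWeightSketch, the static helper, and the exact formula are assumptions made for illustration, not the documented implementation.

```java
// Illustrative sketch only; the formula is an assumption, not taken from this page.
final class TermWeightSketch {

    // Hypothetical helper: BM25-style term importance.
    // pivot is read as a rough average passage length in characters, so
    // contentLength / pivot approximates how many passages the document holds.
    static float weight(float k1, float pivot, int contentLength, int totalTermFreq) {
        float approxPassages = 1 + contentLength / pivot;
        // IDF-like factor: terms that occur rarely in the document weigh more.
        double idf = Math.log(1 + (approxPassages + 0.5) / (totalTermFreq + 0.5));
        return (k1 + 1) * (float) idf;
    }

    public static void main(String[] args) {
        float k1 = 1.2f;   // documented default
        float pivot = 87f; // documented default
        for (int ttf : new int[] {1, 5, 50}) {
            System.out.printf("totalTermFreq=%d weight=%.4f%n",
                    ttf, weight(k1, pivot, 10_000, ttf));
        }
    }
}
```

Under this assumed form, a term that appears once in a long document scores higher than one that appears many times, which is the behavior a term-importance factor is expected to have.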
      • tf

        public float tf(int freq,
                        int passageLen)
        Computes term weight, given the frequency within the passage and the passage's length.
        Parameters:
        freq - number of occurrences of the term within this passage
        passageLen - length of the passage in characters.
        Returns:
        term weight
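
The parameter descriptions (k1 for saturation, b for the degree of length normalization, pivot as a rough average length) match the standard BM25 term-frequency component. The sketch below shows that standard form, assuming the passage length is normalized against pivot. The class name PassageTfSketch and the static helper are hypothetical, and the formula is an assumption rather than something this page confirms.

```java
// Illustrative sketch only; assumes the standard BM25 tf component.
final class PassageTfSketch {

    // Hypothetical helper: saturating term-frequency weight in [0, 1).
    static float tf(float k1, float b, float pivot, int freq, int passageLen) {
        // Length normalization: passages longer than pivot are penalized,
        // to the degree set by b (b = 0 disables length normalization).
        float norm = k1 * ((1 - b) + b * (passageLen / pivot));
        // Saturation: each extra occurrence adds less than the previous one.
        return freq / (freq + norm);
    }

    public static void main(String[] args) {
        float k1 = 1.2f, b = 0.75f, pivot = 87f; // documented defaults
        for (int freq = 1; freq <= 4; freq++) {
            System.out.printf("freq=%d short=%.4f long=%.4f%n",
                    freq, tf(k1, b, pivot, freq, 40), tf(k1, b, pivot, freq, 400));
        }
    }
}
```

With this form, raising k1 delays saturation, and raising b makes long passages pay a larger penalty for the same term frequency.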
      • norm

        public float norm(int passageStart)
        Normalize a passage according to its position in the document.

        Typically passages towards the beginning of the document are more useful for summarizing the contents.

        The default implementation returns 1 + 1/log(pivot + passageStart).

        Parameters:
        passageStart - start offset of the passage
        Returns:
        a boost value multiplied into the passage's score.
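
To make the documented default formula concrete, the sketch below evaluates 1 + 1/log(pivot + passageStart) for a few start offsets. The class name PositionNormSketch and the static helper are hypothetical, and reading log as the natural logarithm is an assumption.

```java
// Illustrative sketch only; mirrors the documented formula outside the library.
final class PositionNormSketch {

    // Hypothetical helper: 1 + 1/log(pivot + passageStart).
    // Natural logarithm is assumed; pivot must exceed 1 so the log is positive
    // when passageStart is 0.
    static float norm(float pivot, int passageStart) {
        return 1 + 1 / (float) Math.log(pivot + passageStart);
    }

    public static void main(String[] args) {
        float pivot = 87f; // documented default
        for (int start : new int[] {0, 100, 1_000, 10_000}) {
            System.out.printf("passageStart=%d boost=%.4f%n", start, norm(pivot, start));
        }
    }
}
```

The boost is always above 1 and decays slowly as the passage starts later, so early passages get a mild preference without overwhelming the term-based score.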
      • score

        public float score(Passage passage,
                           int contentLength)