Drill-down queries find longer phrases which include the query string. The results indicate the number and percentage of occurrences of each phrase accounted for by individual longer phrases, highlighting especially frequent "expansions" to help detect the boundaries of recurrent phrases.
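As a rough illustration of the idea, the sketch below counts the n-word sequences in a toy token list that contain a query phrase, then reports each expansion as a share of the shorter phrase. The function name `expansions` and the toy sentence are invented for this example; the real service presumably works from precomputed corpus counts, not from raw text.

```python
from collections import Counter

def expansions(tokens, query, n):
    """Count n-grams in `tokens` that contain the token sequence `query`."""
    q = tuple(query)
    counts = Counter()
    for i in range(len(tokens) - n + 1):
        gram = tuple(tokens[i:i + n])
        # keep the n-gram only if the query occurs contiguously inside it
        if any(gram[j:j + len(q)] == q for j in range(n - len(q) + 1)):
            counts[gram] += 1
    return counts

# toy data, not corpus figures
tokens = "a trip to the moon and a trip to the shops and the trip to london".split()

parent_freq = expansions(tokens, ["trip", "to"], 2)[("trip", "to")]
three_word = expansions(tokens, ["trip", "to"], 3)
for gram, freq in three_word.most_common():
    print(" ".join(gram), freq, f"{100 * freq / parent_freq:.2f}%")
```

Because a phrase can be expanded to the left or to the right, the shares of the expansions overlap and need not sum to 100%.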
By default, drill-down queries…
To restrict the scope or direction matched, insert start / end of word / phrase codes:
Drill-down results sum all counts of a given string, ignoring PoS tags. Consequently, when the same sequence of words has been tagged in different ways, frequencies may be greater than those reported by other query types.
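The effect of ignoring PoS tags can be sketched as follows. The tagged counts and tag labels here are placeholders made up for illustration, and `merge_tags` is a hypothetical helper, not part of the service.

```python
from collections import Counter

# hypothetical counts for one word sequence under different taggings
tagged_counts = {
    (("trip", "NN1"), ("to", "PRP")): 5,
    (("trip", "NN1"), ("to", "TO0")): 2,
    (("trip", "VVB"), ("to", "PRP")): 1,
}

def merge_tags(tagged_counts):
    """Sum frequencies over every tagging of the same word sequence."""
    merged = Counter()
    for gram, freq in tagged_counts.items():
        words = tuple(word for word, _tag in gram)  # drop the tags
        merged[words] += freq
    return merged

merged = merge_tags(tagged_counts)
print(merged[("trip", "to")])
```

A query type that distinguishes tags would report the three counts separately, so none of its figures would reach the merged total.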
Results are displayed as nested boxes containing all expansions which meet the user's frequency threshold. From the lowest value of n to the highest, each expansion of matching phrases by one word is nested in a box one level deeper.
Take, for example, the query [[:<:]]trip[[:>:]] for n values 2-6 (partial results below; click here to view a larger sample). The most frequent two-word match is trip to. This "top-level phrase" (lowest value of n) appears in the outermost box (solid lines). Three-word expansions of trip to, such as a trip to, are nested one level lower in dotted-line boxes within the box of their "parent phrase" trip to. In turn, a trip to has its own four-word "child phrases", such as a trip to the and on a trip to, nested in dotted-line boxes one more level down. Sibling phrases of a trip to, like trip to the and the trip to, are nested at the same level under the top-level box.
The boxes are like modified Venn diagrams using rectangles instead of circles. Each box contains a subset of the occurrences of the parent phrase whose box contains it. The occurrences of trip to are a subset of all occurrences of trip, and the occurrences of trip to the are a subset of all occurrences of trip to. A phrase's level of nesting reflects the number of supersets above it: trip ⊃ trip to ⊃ trip to the ⊃ trip to the moon (where ⊃ means "is a superset of").
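The nesting can be approximated with a short recursive sketch in which each phrase's children are the phrases one word longer that begin or end with it. The helper names `children` and `nest` are hypothetical, and listing a phrase under every parent it extends is an assumption about the display, not something the description above confirms.

```python
def children(parent, phrases):
    """Phrases one word longer than `parent` that start or end with it."""
    n = len(parent)
    return [p for p in phrases
            if len(p) == n + 1 and (p[:n] == parent or p[1:] == parent)]

def nest(parent, phrases, depth=0):
    """Return an indented outline: one extra indent level per added word."""
    lines = ["  " * depth + " ".join(parent)]
    for child in children(parent, phrases):
        lines.extend(nest(child, phrases, depth + 1))
    return lines

# phrases taken from the example above
phrases = [tuple(s.split()) for s in [
    "trip to", "a trip to", "trip to the", "the trip to",
    "a trip to the", "on a trip to", "trip to the moon",
]]

outline = nest(("trip", "to"), phrases)
print("\n".join(outline))
```

In this sketch a trip to the shows up twice, once under a trip to and once under trip to the, since it is a one-word expansion of both.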
Each parent may have many children, i.e. each set may have many subsets. For orientation in a long list, hover the mouse over a phrase to display the parent phrase and its frequency. To help identify significant recurrent phrases, "children" are color-coded if they account for more than a given percentage (specified above under "Options") of all occurrences of either the top-level "ancestor" phrase or their parent phrase (shades of red and green respectively, varying in intensity with the actual percentage).
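The color-coding rule can be sketched as a simple threshold test. The function `highlights` and the frequencies passed to it are invented for illustration; how the page maps a percentage to a shade is not described here, so the sketch only returns the percentages that would be colored.

```python
def highlights(child, parent, ancestor, threshold_pct):
    """Return the colors a child phrase would receive, with the share behind each.

    red   = share of the top-level ancestor's occurrences
    green = share of the immediate parent's occurrences
    """
    shares = {
        "red": 100 * child / ancestor,
        "green": 100 * child / parent,
    }
    return {color: round(pct, 2)
            for color, pct in shares.items() if pct > threshold_pct}

# hypothetical frequencies: ancestor 1000, parent 200, threshold 10%
both = highlights(child=120, parent=200, ancestor=1000, threshold_pct=10)
parent_only = highlights(child=30, parent=200, ancestor=1000, threshold_pct=10)
print(both, parent_only)
```

A child can pass the parent test while failing the ancestor test, as in the second call, because the parent's frequency is never larger than the ancestor's.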
Sample results
| a | trip to | 413 | 29.88 | 29.88 | < | 29.884 |
| trip to | the | 335 | 24.24 | > | 24.24 | 24.240 |
| the | trip to | 141 | 10.20 | 40.09 | < | 10.203 |
| 's | trip to | 57 | 4.12 | 44.21 | < | 4.124 |
| day | trip to | 47 | 3.40 | 47.61 | < | 3.401 |
| his | trip to | 35 | 2.53 | 50.14 | < | 2.533 |
| first | trip to | 32 | 2.32 | 52.46 | < | 2.315 |
| trip to | london | 31 | 2.24 | > | 26.48 | 2.243 |
| trip to | a | 26 | 1.88 | > | 28.36 | 1.881 |
| trip to | america | 24 | 1.74 | > | 30.10 | 1.737 |
| trip to | new | 23 | 1.66 | > | 31.77 | 1.664 |
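The sample figures can be cross-checked with a little arithmetic. The sketch below rests on two readings of the rows that are inferences, not statements from the page: that `<` and `>` mark expansions to the left and right of trip to, and that the figure next to the marker is a running percentage kept separately for each direction. The parent total of 1382 is likewise inferred from 413 / 0.29884 and is not given in the sample.

```python
# (frequency, direction) pairs copied from the sample rows, in display order
rows = [
    (413, "<"), (335, ">"), (141, "<"), (57, "<"), (47, "<"), (35, "<"),
    (32, "<"), (31, ">"), (26, ">"), (24, ">"), (23, ">"),
]
PARENT_TOTAL = 1382  # assumption: inferred from the percentages, not stated

def running_shares(rows, total):
    """Per-row percentage plus a running percentage for each direction."""
    running = {"<": 0, ">": 0}
    out = []
    for freq, direction in rows:
        running[direction] += freq
        out.append((round(100 * freq / total, 2),
                    round(100 * running[direction] / total, 2)))
    return out

shares = running_shares(rows, PARENT_TOTAL)
for (freq, direction), (pct, cumulative) in zip(rows, shares):
    print(freq, direction, pct, cumulative)
```

Under these assumptions the computed pairs reproduce the percentage and running-percentage figures in every sample row.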
Child phrases are displayed in decreasing order of frequency. Five columns of numbers follow each phrase: