This data set includes just under 600 items (articles) totaling 2.4 million words. Counts and frequencies of unigrams, bigrams, and computed keywords are visualized below. Ask yourself, "To what degree is this corpus large, and what is this corpus about?"
![]() unigrams
![]() bigrams
![]() keywords
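
Counts and frequencies like those visualized above could be computed along the following lines. This is a minimal sketch and not the tooling used to produce the figures: the tokenizer, the function names, and the sample sentence are all assumptions made for illustration, and no stop words are removed.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens (an assumed, simple tokenizer)."""
    return re.findall(r"[a-z]+", text.lower())


def ngrams(tokens, n):
    """Return every run of n consecutive tokens as a tuple."""
    return list(zip(*(tokens[i:] for i in range(n))))


def frequencies(items):
    """Map each item to a (count, relative frequency) pair."""
    counts = Counter(items)
    total = sum(counts.values())
    return {item: (count, count / total) for item, count in counts.items()}


# Hypothetical sample text; a real run would read the articles from disk.
sample = "Libraries share code. Libraries share data, and libraries build software."

tokens = tokenize(sample)
unigrams = frequencies(ngrams(tokens, 1))
bigrams = frequencies(ngrams(tokens, 2))  # note: bigrams span sentence boundaries here

# List the most frequent bigrams first.
for gram, (count, freq) in sorted(bigrams.items(), key=lambda kv: kv[1][0], reverse=True)[:3]:
    print(" ".join(gram), count, round(freq, 3))
```

Counts show how often a term occurs; relative frequencies make those counts comparable across corpora of different sizes, which bears on the question of how large this corpus is.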

network of keywords and articles
| labels | weights | features |
|---|---|---|
| project | 0.80364 | project system development work users process time software libraries services data systems |
| code | 0.25247 | code libraries work community editorial people articles technology source time software authors |
| search | 0.1771 | search data results query information users api google discovery database result searches |
| records | 0.17556 | records record data marc field script title metadata fields code name process |
| content | 0.14595 | content web libraries google site html link code links pages item website |
| collections | 0.09145 | collections metadata archival web content archives islandora images objects information description archive |
| metadata | 0.08229 | metadata data repository object objects xml name type access fedora files identifier |
| mobile | 0.08018 | mobile app students web location reference devices information application map computer system |
| data | 0.07648 | data name metadata model marc records work information rdf bibliographic web frbr |
| text | 0.06973 | text data word words model analysis results terms models language research dataset |
| api | 0.06763 | api data code web request services form information xml link server name |
| files | 0.06299 | files video preservation format disk images media storage data audio content version |
![]() topics
![]() topics over time
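
One way to read the weights column in the topic table is to express each weight as a share of the total. The sketch below assumes the weights can be treated as relative measures of prominence, which the table itself does not state; the function name is hypothetical, and the values are copied from the table.

```python
# Topic labels and weights copied from the table above.
topic_weights = {
    "project": 0.80364,
    "code": 0.25247,
    "search": 0.1771,
    "records": 0.17556,
    "content": 0.14595,
    "collections": 0.09145,
    "metadata": 0.08229,
    "mobile": 0.08018,
    "data": 0.07648,
    "text": 0.06973,
    "api": 0.06763,
    "files": 0.06299,
}


def weight_shares(weights):
    """Divide each weight by the sum of all weights so the shares add up to 1."""
    total = sum(weights.values())
    return {label: weight / total for label, weight in weights.items()}


shares = weight_shares(topic_weights)

# Print topics from largest to smallest share (assumes weights are comparable).
for label, share in sorted(shares.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{label:12s} {share:6.1%}")
```

Under that assumption, the "project" topic dominates the rest, which is one hint toward answering what the corpus is about.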