Dynamically generating T32 training documents using structured data
Keywords: T32 Training Grants, National Institutes of Health, Author Disambiguation, Administrative Burden
Background: The US National Institutes of Health (NIH) funds academic institutions for training doctoral (PhD) students and postdoctoral fellows. These training grants, known as T32 grants, require schools to create, in a particular format, seven or eight Word documents describing the program and its participants. Weill Cornell Medicine aimed to use structured name and citation data to dynamically generate tables, thus saving administrators time.
Case Presentation: The author’s team collected identity and publication metadata from existing systems of record, including our student information system and previous T32 submissions. These data were fed into our ReCiter author disambiguation engine. Well-structured bibliographic metadata, including the rank of the target author in each author list, were output and stored in a MySQL database. We then ran a database query that output a Word Extensible Markup Language (XML) document according to NIH’s specifications. The query that generates the T32 training document ties faculty listed on a grant submission to the publications that they and their mentees authored, bolding author names as required. Because our source data are well-structured and well-defined, the only parameter the query needs is a single identifier for the grant itself. The open-source code for producing this document is at http://dx.doi.org/10.5281/zenodo.2593545.
Conclusions: Manually writing a table for T32 grant submissions is a substantial administrative burden; some documents generated in this manner exceed 150 pages. Provided they have a source for structured identity and publication data, administrators can use the T32 Table Generator to readily output a table.
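The query-and-render step described in the Case Presentation can be illustrated with a minimal sketch. It is not the published T32 Table Generator: the table names, column names, sample rows, and helper functions below are hypothetical, SQLite stands in for MySQL so the example is self-contained, and the output is only a fragment of Word XML paragraphs rather than a complete document. It shows the two ideas from the text: a single grant identifier is the only query parameter, and the stored author rank determines which names are bolded.

```python
import sqlite3
from xml.sax.saxutils import escape

# Hypothetical schema (an assumption for illustration only).
SCHEMA = """
CREATE TABLE grant_faculty (grant_id TEXT, faculty_id TEXT);
CREATE TABLE mentorship (faculty_id TEXT, trainee_id TEXT);
CREATE TABLE authorship (pmid INTEGER, person_id TEXT, author_rank INTEGER);
CREATE TABLE publication (pmid INTEGER PRIMARY KEY, authors TEXT, citation TEXT);
"""

# The grant identifier is the only parameter. Faculty on the grant are tied
# to their trainees' publications; the faculty rank is NULL when the mentor
# is not a co-author.
QUERY = """
SELECT p.authors, p.citation,
       ta.author_rank AS trainee_rank,
       fa.author_rank AS faculty_rank
FROM grant_faculty gf
JOIN mentorship m ON m.faculty_id = gf.faculty_id
JOIN authorship ta ON ta.person_id = m.trainee_id
JOIN publication p ON p.pmid = ta.pmid
LEFT JOIN authorship fa
       ON fa.pmid = p.pmid AND fa.person_id = gf.faculty_id
WHERE gf.grant_id = ?
ORDER BY gf.faculty_id, m.trainee_id, p.pmid
"""


def run(text, bold=False):
    """Return one WordprocessingML run, optionally bold."""
    props = "<w:rPr><w:b/></w:rPr>" if bold else ""
    return f'<w:r>{props}<w:t xml:space="preserve">{escape(text)}</w:t></w:r>'


def citation_paragraph(authors, citation, bold_ranks):
    """Build a paragraph, bolding authors whose 1-based rank is in bold_ranks."""
    names = authors.split("; ")
    runs = []
    for rank, name in enumerate(names, start=1):
        runs.append(run(name, bold=rank in bold_ranks))
        runs.append(run(", " if rank < len(names) else ". "))
    runs.append(run(citation))
    return "<w:p>" + "".join(runs) + "</w:p>"


def render_grant(conn, grant_id):
    """Return one Word XML paragraph per trainee publication on the grant."""
    paragraphs = []
    for authors, citation, trainee_rank, faculty_rank in conn.execute(
        QUERY, (grant_id,)
    ):
        bold = {r for r in (trainee_rank, faculty_rank) if r is not None}
        paragraphs.append(citation_paragraph(authors, citation, bold))
    return paragraphs


def demo_connection():
    """Create an in-memory database with made-up sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO grant_faculty VALUES (?, ?)",
        [("T32-DEMO", "fac1"), ("T32-OTHER", "fac2")],
    )
    conn.execute("INSERT INTO mentorship VALUES ('fac1', 'tr1')")
    conn.execute(
        "INSERT INTO publication VALUES (1, 'Smith J; Lee K; Doe A', "
        "'Example title. J Example. 2019;1(1):1-2.')"
    )
    # Trainee tr1 is the second author; mentor fac1 is the third.
    conn.executemany(
        "INSERT INTO authorship VALUES (?, ?, ?)",
        [(1, "tr1", 2), (1, "fac1", 3)],
    )
    return conn


if __name__ == "__main__":
    for paragraph in render_grant(demo_connection(), "T32-DEMO"):
        print(paragraph)
```

Storing the author rank alongside each authorship record, as the disambiguation output does in the text, is what lets the renderer bold a name by position instead of by string matching, which would be unreliable for common surnames. A full document would additionally wrap these paragraphs in a Word XML body with the `w` namespace declared.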
Beck WT, Ablordeppey SY, Elmquist WF, Galt KA, Malone DC, Mount JK, Miller KW. Impact of the NIH roadmap on the future of graduate education in colleges and schools of pharmacy: report of the 2004–2005 Research and Graduate Affairs Committee. Am J Pharm Educ. 2005;69(5):S18. DOI: http://dx.doi.org/10.5688/aj6905S18.
US Department of Health & Human Services. NIH Research Portfolio Online Reporting Tools (RePORT) [Internet]. The Department [cited 30 Dec 2018]. <https://projectreporter.nih.gov/Reporter_Viewsh.cfm?sl=12EFCE0B478EC3D27598B8961CAA4A01A2FFCEB861BF>.
National Institutes of Health Office of Extramural Research. Data tables [Internet]. The Institutes [cited 30 Dec 2018]. <https://grants.nih.gov/grants/forms/data-tables.htm>.
National Research Council (US), Committee for the Assessment of NIH Minority Research Training Programs. Assessment of NIH minority research and training programs: phase 3. Washington, DC: National Academies Press; 2005. xi, 227 p. ISBN: 978-0-309551847.
Smith-Yoshimura K, Altman M, Conlon M, Cristán AL, Dawson L, Dunham J, Hickey T, Hook D, Horstmann W, MacEwan A, Schreur P, Smart L, Wacker M, Woutersen S. Registering researchers in authority files [Internet]. OCLC Research [cited 30 Oct 2017]. <http://www.oclc.org/content/dam/research/publications/library/2014/oclcresearch-registering-researchers-2014.pdf>.
Johnson SB, Bales ME, Dine D, Bakken S, Albert PJ, Weng C. Automatic generation of investigator bibliographies for institutional research networking systems. J Biomed Inform. 2014 Oct;51:8–14. DOI: http://dx.doi.org/10.1016/j.jbi.2014.03.013.
Tian F, DeWitt DJ, Chen J, Zhang C. The design and performance evaluation of alternative XML storage strategies. ACM Sigmod Record. 2002 Mar;31(1):5–10. DOI: http://dx.doi.org/10.1145/507338.507341.
Albert PJ. T32 Table 5 generator. DOI: http://dx.doi.org/10.5281/zenodo.2593545.
Albert P, Bales M. VIVO Dashboard - a semantic application for visualizing publication data [Internet]. [cited 29 Apr 2019]. <https://github.com/wcmc-its/vivodashboard>.