Whether you plan to collect your data via a survey, from patient or medical records, or by other means, SDBC collaborators can review your survey or other data collection tool prior to data collection. This review helps ensure that your data collection is comprehensive and includes the appropriate variables to meet your study objectives.
Appropriate collection and formatting of your data is crucial for facilitating statistical analysis. While there are a number of database management systems to choose from, using spreadsheets such as Excel for data entry and storage is usually not a good idea. Below are suggestions to help you choose the right database management system:
For those conducting surveys or focus groups, other data collection issues are involved including:
For studies involving the Utah Population Database (UPDB), an initial meeting should be scheduled with both the SDBC and UPDB staff to determine the most appropriate database to meet the investigator’s needs.
For studies involving the Enterprise Data Warehouse (EDW), an initial meeting should be scheduled with both the SDBC and EDW staff to determine the most appropriate database to meet the investigator’s needs.
A data dictionary lists the variable names in your database, a brief summary of what each variable is, and the range of possible values that each variable may take. This helps the statistician read through your database easily and also allows them to perform data integrity checks. A data dictionary is also an excellent way to keep track of your variables, especially if abbreviations are used in your database. If you have any questions about the data dictionary, feel free to ask a statistician. See below for an example:
All variable names should have the following properties:
Formatting your data will expedite your statistical analysis. While REDCap implements many critical aspects of data formatting, it may be helpful to review the following before you begin.
Specific formats for longitudinal or repeated measures studies:
All data should be securely stored, and access should be restricted to those individuals who are entering data. Properly dispose of paper and electronic files, keep paper copies in a locked cabinet or drawer, and store electronic files on a secure-access central server.
Also keep in mind the Health Insurance Portability and Accountability Act’s (HIPAA) Minimum Necessary Principle when listing what variables to include in a database. Use or disclose only the information necessary to the task. Excluding unnecessary items that make information identifiable is an important step to ensure privacy, security, and patient confidentiality. Identifiable information includes items listed below. If identifiable information is necessary for research (e.g., birth date, visit date, physical address), take necessary precautions to protect the database: strong passwords, anti-virus software, data backup, possibly encryption, and being very cautious with email. Please refer to COMIRB and HIPAA for additional stipulations.
List of Identifiable Information
After the data is collected, we will implement the analyses dictated in the statistical analysis plan (SAP) that we developed collaboratively with you prior to the data collection stage. There may also be additional follow-up analyses that are requested after viewing results from the SAP. These additional analyses will be delineated from pre-specified analyses in the SAP in an addendum section. We are also available to help with the presentation of methods and results in your manuscript; please see the next section, "Publish," for details.
If you wish to conduct your own statistical analysis, please indicate this when you contact the SDBC (via our online Request for Collaboration form). We can connect you with statistician(s) who are experienced in teaching statistics and statistical programming.
If you plan to conduct your own statistical analysis, the following resources may be helpful:
Data Collection & Analysis
Data Collection Details
Research Electronic Data Capture (REDCap)
Data Dictionary
Variable Name | Description | Valid Values
Gender | Patient's gender | M, F or 1=M, 2=F
Survmon | Patient's survival in months | 0.001 - 160
Surg | Patient's surgery status | 1=Yes, 2=No or Yes, No
Race | Patient's race | AA=African American, H=Hispanic...
Age | Patient's age | 11 - 35
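One way to see how a data dictionary supports data integrity checks is to encode it and test each record against it. The following is a minimal Python sketch, not an SDBC tool: the `DATA_DICTIONARY` structure and the `check_record` function are hypothetical names introduced here for illustration, the coded alternatives (such as 1=M, 2=F) are left out for brevity, and Race is omitted because its full list of valid values is not shown in the example.

```python
from numbers import Real

# Hypothetical encoding of the example data dictionary.
# "allowed" lists permitted categories; "range" gives inclusive numeric bounds.
# Race is left out because its complete set of valid values is not given.
DATA_DICTIONARY = {
    "Gender": {"description": "Patient's gender", "allowed": {"M", "F"}},
    "Survmon": {"description": "Patient's survival in months", "range": (0.001, 160)},
    "Surg": {"description": "Patient's surgery status", "allowed": {"Yes", "No"}},
    "Age": {"description": "Patient's age", "range": (11, 35)},
}


def check_record(record, dictionary=DATA_DICTIONARY):
    """Return a list of human-readable problems found in one record."""
    problems = []
    for name, spec in dictionary.items():
        if name not in record:
            problems.append(f"{name}: missing")
            continue
        value = record[name]
        if "allowed" in spec and value not in spec["allowed"]:
            problems.append(f"{name}: {value!r} is not one of {sorted(spec['allowed'])}")
        if "range" in spec:
            low, high = spec["range"]
            is_number = isinstance(value, Real) and not isinstance(value, bool)
            if not is_number or not (low <= value <= high):
                problems.append(f"{name}: {value!r} is outside {low} - {high}")
    # Flag variables that appear in the data but are not documented.
    for name in record:
        if name not in dictionary:
            problems.append(f"{name}: not in data dictionary")
    return problems


good = {"Gender": "F", "Survmon": 42.5, "Surg": "Yes", "Age": 27}
bad = {"Gender": "X", "Survmon": 200, "Surg": "No", "Age": 27, "Wt": 70}

print(check_record(good))  # []
for problem in check_record(bad):
    print(problem)
```

Keeping the valid values in one place means the same dictionary can document the database for collaborators and drive the checks, so the two are less likely to drift apart.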
Variable Name Requirements
Data Formatting
Protected Health Information (PHI)
Analysis
Conducting Your Own Statistical Analysis
Our Office
University of Utah Research Park
Williams Building, 1st floor
295 South Chipeta Way
Salt Lake City, Utah
Parking: During construction, you may park on the bottom floor of the south parking structure.
Contact
Camie Derricott
Camie.Derricott@hsc.utah.edu
Acknowledging the SDBC
Please use the following text to acknowledge the CTSI Study Design and Biostatistics Center:
"This investigation was supported by Translational Research: Implementation, Analysis and Design (TRIAD), with funding in part from the National Center for Advancing Translational Sciences of the National Institutes of Health under Award Number UM1TR004409. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health."
"This investigation was supported by the Study Design and Biostatistics Center (SDBC), with funding in part from the National Center for Advancing Translational Sciences of the National Institutes of Health under Award Number UM1TR004409. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health."