
Forum Replies Created

Viewing 6 reply threads
    • #36396
      Andrew Hall
      Participant

      Hi, I like the way you used color to differentiate sections.

      I would suggest listing the different options for race with a checkbox for each. It’s good that you have the “Other” option for race.

    • #36395
      Andrew Hall
      Participant

      Ashara, this CRF is very well organized and the response options are well detailed!

      I’m not sure how the subject initials and physician signature would line up with privacy protocols. That could possibly be identifying information, right?

    • #36278
      Andrew Hall
      Participant

I would suggest adding a check box to the temperature question with the letter “F” for Fahrenheit, per the CDISC standard. The user could then choose “C” or “F” as the unit. If one of the study sites is in the US, body temperature analysis will likely default to Fahrenheit.
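To illustrate why capturing the unit matters downstream, here is a minimal sketch of normalizing a recorded temperature to a single unit before analysis. The function name and the example reading are my own illustrations, not part of any CDISC standard.

```python
def to_celsius(value, unit):
    """Normalize a recorded body temperature to Celsius for analysis.

    `unit` is the value of the hypothetical C/F check box on the CRF.
    """
    if unit == "C":
        return value
    if unit == "F":
        return (value - 32) * 5 / 9
    raise ValueError(f"unknown unit: {unit}")

# A US-site reading recorded in Fahrenheit normalizes to Celsius:
print(round(to_celsius(98.6, "F"), 1))  # 37.0
```

Recording the unit explicitly on the form makes this kind of normalization unambiguous, rather than forcing analysts to guess the unit from the site's country.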

    • #36277
      Andrew Hall
      Participant

One benefit of having data standards for clinical research is that it gives data scientists and clinicians a common language with which to address the subject matter. If researchers and analysts from different disciplines can overcome siloing, as mentioned in the lecture, then the pace of analysis will accelerate. With comprehensive data standards, data from different studies becomes accessible to data scientists.

    • #36276
      Andrew Hall
      Participant

I don’t have experience with a study in a medical or scientific context. However, in my work with web server logs that I mentioned last week, I have a great deal of experience with the time stamp, user authentication, and edit check elements. Server logs record a time stamp for each HTTP request to the web server. Often the time stamp is in UTC, which required some mental math to find the requests I was looking for: I’m located on the east coast of the United States, which is four or five hours behind UTC depending on the time of year.

We controlled user access and authentication through my organization’s help desk, which authorized users via the Okta authentication and security software. At my supervisor’s request, the help desk added the database icon to my Okta dashboard, which allowed me to access the database.

The database itself used the Splunk framework to store and query data. The query process in Splunk had edit checks to make sure input variables met syntax standards, and, if I remember correctly, the search interface supported syntax adherence with an autocomplete drop-down for variable options. I assume we had a data backup and recovery plan, but I was unaware of one.
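The UTC-to-Eastern mental math described above can be automated. A minimal sketch in Python, using the standard library’s `zoneinfo` (the timestamp value is a made-up example, not an actual log entry):

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

# A UTC timestamp like those in server logs (hypothetical example value).
utc_ts = datetime(2023, 1, 15, 18, 30, tzinfo=timezone.utc)

# Convert to US Eastern time; zoneinfo applies the correct offset for the
# date, UTC-5 (EST) in winter or UTC-4 (EDT) in summer.
eastern_ts = utc_ts.astimezone(ZoneInfo("America/New_York"))
print(eastern_ts.isoformat())  # 2023-01-15T13:30:00-05:00 (January is EST)
```

Letting the time-zone database handle the daylight-saving boundary avoids the four-vs-five-hour guesswork entirely.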

    • #36050
      Andrew Hall
      Participant

      Most of my experience in the data management workflow has been with the data manipulation and analysis step. My experience with this step comes from my role as an engineer analyzing server logs to determine the root cause of an error. I manipulated the variables in the query to best fit the parameters of the error investigation. I then analyzed the results to deduce the possible root cause of the error.

      I don’t have experience with the two ends of the data management workflow. As a technical-level engineer, I don’t have experience with the conception and creation phase of data management nor with the data entry phase. Since I acted immediately on the data, I don’t have any experience with the data reporting and archiving phases.

      If I could go back, I would have tried to get more experience with the data entry and processing as well as the data validation and quality control steps. I would have been a more effective diagnostician if I had more familiarity with how the database stored and processed the specific data points, as I often struggled to write effective queries.

    • #36049
      Andrew Hall
      Participant

1. In my last role, I worked as a development operations engineer (systems administration with cloud computing) for a software startup that produced a content management system. One of my duties was to investigate the root cause of the automated error alerts we received so we could troubleshoot the error. Every time a user visits a website, the web server logs a record of that visit, or HTTP request, and each HTTP request has a response code. I investigated error response codes and other variables within the server logs to see which requests failed and diagnose the issue from the attached error message.
      2. I queried secondary data in a database of server logs. Servers automatically log each request and a team of technicians funneled the logs into a searchable database.
      3. My primary means of data collection was searching for data with detailed queries in the server log database. The search terms in the queries were variables relevant to the time period, website, and error I searched for each case.
4. The most significant problem for data collection was the sheer size of the collection of server logs. Querying the database was slow because the data load was so large and our underlying database resources were insufficient for a load of that size. We were able to make some improvements to the speed of the database over time.
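The workflow in points 1–3 above can be sketched in a few lines: filter server log entries down to the requests whose response codes indicate errors. The log lines and regex are my own illustration (Common Log Format is assumed; the actual logs and Splunk queries were different):

```python
import re

# Hypothetical log lines in Common Log Format.
log_lines = [
    '10.0.0.1 - - [15/Jan/2023:18:30:01 +0000] "GET /home HTTP/1.1" 200 512',
    '10.0.0.2 - - [15/Jan/2023:18:30:02 +0000] "GET /missing HTTP/1.1" 404 198',
    '10.0.0.3 - - [15/Jan/2023:18:30:03 +0000] "POST /api HTTP/1.1" 500 74',
]

# The three-digit status code immediately follows the quoted request line.
STATUS_RE = re.compile(r'" (\d{3}) ')

def error_requests(lines):
    """Return log lines whose HTTP status is 4xx or 5xx (client/server errors)."""
    errors = []
    for line in lines:
        m = STATUS_RE.search(line)
        if m and m.group(1)[0] in ("4", "5"):
            errors.append(line)
    return errors

for line in error_requests(log_lines):
    print(line)  # the 404 and 500 entries
```

In a real investigation the filter would also narrow by time period and site, as described in point 3, but the principle is the same: the query terms select the subset of log records relevant to the error.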

    • #36394
      Andrew Hall
      Participant

      Pawdoo, thanks for your feedback! I think the normal/abnormal options are a good idea. I intended the bullet point to be a check box. The normal/abnormal/not done options are more specific.

    • #36389
      Andrew Hall
      Participant

Thank you, Taro! Per your and Dr. Saranath’s feedback, I will make “Screening” the title of the table so that it covers all subsequent questions.

    • #36313
      Andrew Hall
      Participant

      Thanks! I’ll learn more about those packages.

    • #36283
      Andrew Hall
      Participant

Along with the choices for race, we should include a “specify other” option, right?

    • #36281
      Andrew Hall
      Participant

Pooling data to analyze for patterns in drug safety is a good point. It could enable faster detection of and response to adverse events with a new drug, reducing the number of patients who experience them.

    • #36280
      Andrew Hall
      Participant

      That’s great you used R in your analysis! Did you use any R packages that were specifically designed for biological applications?

    • #36153
      Andrew Hall
      Participant

      Being new to health disciplines in general, I appreciate that you all are teaching me about the different types of research! Often I think the focus in data science is on data sets that are decontextualized from their knowledge domains and sources, such as clinical trials or basic research.

    • #36152
      Andrew Hall
      Participant

      I find the analysis of qualitative data interesting in the context of data science, where often analysts will use algorithms to analyze quantitative data in order to make predictions. Integrating qualitative data into data science and machine learning is something I’m interested in learning more about. Perhaps sentiment analysis using natural language processing is the closest use case.
