DISCLAIMER: This application is used for demonstrative and illustrative purposes only and does not constitute an offering that has gone through regulatory review. It is not intended to serve as a medical application. There is no representation as to the accuracy of the output of this application and it is presented without warranty.

Classify medical diagnoses with ICD-10 codes

This application was built to demonstrate IBM's Watson Natural Language Classifier (NLC). The data set we will be using, ICD-10-GT-AA.csv, contains a subset of ICD-10 entries. ICD-10 is the 10th revision of the International Statistical Classification of Diseases and Related Health Problems. In short, it is a medical classification list maintained by the World Health Organization (WHO) that contains codes for diseases, signs and symptoms, abnormal findings, complaints, social circumstances, and external causes of injury or disease. Hospitals and insurance companies alike could save time and money by using Watson to tag diagnoses with the most accurate ICD-10 codes.

This application is a Python web application built on the Flask microframework and based on earlier work by Ryan Anderson. It uses the Watson Python SDK to create the classifier, list classifiers, and classify the input text. It also uses the freely available ICD-10 API which, given an ICD-10 code, returns a name and description.

When the reader has completed this pattern, they will understand how to:

Flow

  1. CSV files are sent to the Natural Language Classifier service to train the model.
  2. The user interacts with the web app UI running either locally or in the cloud.
  3. The application sends the user's input to the Natural Language Classifier model to be classified.
  4. The information containing the classification is returned to the web app.
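Steps 3 and 4 of the flow can be sketched without the Watson SDK. In the minimal sketch below, `handle_submission` and `fake_classify` are hypothetical names, the confidence values are made up, and the response dictionary only assumes the general shape of an NLC result (a `top_class` plus a `classes` list of `class_name`/`confidence` pairs). A real app would pass in a function that calls the service instead of the stub.

```python
from typing import Any, Callable, Dict


def handle_submission(text: str, classify: Callable[[str], Dict[str, Any]]) -> Dict[str, Any]:
    """Send the user's text to a classifier and return what the UI needs."""
    cleaned = text.strip()
    if not cleaned:
        # Nothing to classify, so skip the service call entirely.
        return {"error": "Please enter text to classify."}
    response = classify(cleaned)
    return {
        "text": cleaned,
        "top_class": response["top_class"],
        "classes": response["classes"],
    }


def fake_classify(text: str) -> Dict[str, Any]:
    """Stand-in for the NLC call; returns a canned response with made-up scores."""
    return {
        "top_class": "K92.2",
        "classes": [
            {"class_name": "K92.2", "confidence": 0.93},
            {"class_name": "K62.5", "confidence": 0.04},
        ],
    }


result = handle_submission("  Gastrointestinal hemorrhage ", fake_classify)
```

Passing the classifier in as a callable keeps the web handler testable offline; only the callable needs to know about credentials and classifier IDs.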

Included Components

Steps

  1. Clone the repo
  2. Create IBM Cloud services
  3. Create a Watson Studio project
  4. Train the NLC model
  5. Run the application

1. Clone the repo

Clone the nlc-icd10-classifier repo locally. In a terminal, run:

git clone https://github.com/IBM/nlc-icd10-classifier
cd nlc-icd10-classifier

2. Create IBM Cloud services

Create the following service: Watson Natural Language Classifier (NLC).

3. Create a Watson Studio project

4. Train the NLC model

The data used in this example is part of the ICD-10 data set and a cleaned version we'll use is available in the repo under data/ICD-10-GT-AA.csv. We'll now train an NLC model using this data.
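Before training, it can help to sanity-check the CSV. The sketch below assumes each row has the form `text,label` with no header row, which is the usual layout for NLC training data but is not confirmed here for data/ICD-10-GT-AA.csv. `summarize_training_csv` is a hypothetical helper, and the sample rows are illustrative rather than taken from the data set.

```python
import csv
import io
from collections import Counter


def summarize_training_csv(handle):
    """Count examples per class and collect row numbers that lack a text or a label."""
    counts = Counter()
    bad_rows = []
    for row_no, row in enumerate(csv.reader(handle), start=1):
        if len(row) < 2 or not row[0].strip() or not row[1].strip():
            bad_rows.append(row_no)
            continue
        counts[row[1].strip()] += 1
    return counts, bad_rows


# Illustrative in-memory sample; the third row has no label and is reported as bad.
sample = io.StringIO(
    "Gastrointestinal hemorrhage,K92.2\n"
    "\"Hemorrhage of anus and rectum\",K62.5\n"
    "missing label row\n"
)
counts, bad_rows = summarize_training_csv(sample)
```

To run it against the real file, open it with `open("data/ICD-10-GT-AA.csv", newline="", encoding="utf-8")` and pass the handle in. Classes with very few examples are worth reviewing before training.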

5. Run the application

Follow the steps below for deploying the application:

Run on IBM Cloud

Run locally

The general recommendation for Python development is to use a virtual environment (venv). To create and initialize a virtual environment, use the built-in venv module on Python 3; on Python 2.7, install the virtualenv library instead.
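On Python 3, a virtual environment is typically created from the command line with `python3 -m venv <directory>` and then activated. The same standard-library module can also be driven from Python, as in this minimal sketch. The temporary directory is only a placeholder so the example leaves the repo untouched.

```python
import tempfile
import venv
from pathlib import Path

# Placeholder location; in practice you would pick a directory inside the repo.
env_dir = Path(tempfile.mkdtemp()) / "venv"

# with_pip=False keeps the sketch fast and offline; use with_pip=True to bootstrap pip.
venv.EnvBuilder(with_pip=False).create(env_dir)

# Every virtual environment records its configuration in pyvenv.cfg.
created = (env_dir / "pyvenv.cfg").exists()
```

After creating the environment, activate it in your shell and install the application's dependencies into it.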

Sample Output

The user enters text in the Text to classify: text box, and the Watson NLC classifier returns ICD-10 classifications with confidence scores.

Classification of Gastrointestinal hemorrhage: Sample output
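Turning a classification result into a ranked list like the one the UI displays might look like the sketch below. `top_classes` is a hypothetical helper, the response shape (`classes` entries with `class_name` and `confidence`) is an assumption, and the confidence values are made up for illustration.

```python
def top_classes(response, limit=3):
    """Return the `limit` highest-confidence classes as (code, percent-string) pairs."""
    ranked = sorted(response["classes"], key=lambda c: c["confidence"], reverse=True)
    return [(c["class_name"], f"{c['confidence']:.1%}") for c in ranked[:limit]]


# Illustrative response with made-up scores, deliberately out of order.
response = {
    "classes": [
        {"class_name": "K62.5", "confidence": 0.04},
        {"class_name": "K92.2", "confidence": 0.93},
        {"class_name": "K92.1", "confidence": 0.03},
    ],
}
rows = top_classes(response, limit=2)
```

Sorting explicitly avoids relying on the service to return classes in confidence order.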

Links

Learn more

License

This code pattern is licensed under the Apache Software License, Version 2. Separate third party code objects invoked within this code pattern are licensed by their respective providers pursuant to their own separate licenses. Contributions are subject to the Developer Certificate of Origin, Version 1.1 (DCO) and the Apache Software License, Version 2.

Apache Software License (ASL) FAQ