Problem in Kaggle with section 3.1 of baseline code

Hi all, I hit a bump in the road.

I went through the registration and certification process to get access to the data. It was a great experience, and I learned about the latest research protocols.

I used Kaggle due to limited access to technical resources. The code works fine once I switch to a lower model version and decrease the batch sizes. The model did not perform too badly, with an accuracy of 87%.

The problem is in the section ‘3.1. Load the Knowledge Base’, in particular with this line of code:

SG = SnomedGraph.from_serialized("…/snomed_graph/full_concept_graph.gml")

I keep receiving an error that the file doesn't exist. I also checked the script, and there is no class, function, or output called 'full_concept_graph.gml'.

Note also that I created the script and imported it as a library, as required. Can anyone please direct me to the correct way to solve this? I am eager to complete the baseline submission and continue.

Thank you


Hi @Augustine,

The snomed_graph utility was created to help participants load the SNOMED Clinical Terminology into a graph data structure and then query it.

In section 3.1 of the notebook, the commentary shows you how to create a graph from the raw SNOMED RF2 files (tab-delimited, tabular data files) if you don't already have a prebuilt GML file. The file full_concept_graph.gml is such a prebuilt file, so it won't exist until you have built and saved a graph yourself.

The line:

SG = SnomedGraph.from_rf2("SnomedCT_InternationalRF2_PRODUCTION_20230531T120000Z_Challenge_Edition")

… will build you a graph. It’ll take a few minutes to complete. You can then save the graph and reload it locally using the notebook code.
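
In case it helps, here is a minimal sketch of that build-once, reload-later pattern. The helper name load_or_build is hypothetical and not part of snomed_graph. The commented usage at the bottom assumes that the class can be imported with "from snomed_graph import SnomedGraph" and that the graph object exposes a save(path) method; please check the usage notebook for the exact calls.

```python
from pathlib import Path
from typing import Callable, TypeVar

T = TypeVar("T")


def load_or_build(
    cache_path: str,
    build: Callable[[], T],
    load: Callable[[str], T],
    save: Callable[[T, str], None],
) -> T:
    """Return the cached object if cache_path exists, otherwise build and save it.

    Hypothetical helper: it only wraps whatever build/load/save callables
    you pass in, so it makes no assumptions about the underlying library.
    """
    cache = Path(cache_path)
    if cache.exists():
        # Fast path: a serialized copy was saved by an earlier run.
        return load(str(cache))

    # Slow path: build from the raw files (a few minutes for SNOMED RF2).
    obj = build()
    cache.parent.mkdir(parents=True, exist_ok=True)
    save(obj, str(cache))
    return obj


# Possible usage with snomed_graph (assumed import path and save() method):
#
# from snomed_graph import SnomedGraph
#
# SG = load_or_build(
#     "full_concept_graph.gml",
#     build=lambda: SnomedGraph.from_rf2(
#         "SnomedCT_InternationalRF2_PRODUCTION_20230531T120000Z_Challenge_Edition"
#     ),
#     load=SnomedGraph.from_serialized,
#     save=lambda graph, path: graph.save(path),  # assumption: save(path) exists
# )
```

On Kaggle, the cache path needs to point to a writable location, and the saved file only persists for later sessions if it is kept as notebook output or uploaded as a dataset.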

For more details on how to use the snomed_graph package, see the “usage” notebook on the project’s homepage: snomed_graph/usage.ipynb at main · VerataiLtd/snomed_graph · GitHub

Wishing you the best of luck.



Thank you for the speedy feedback, much appreciated.