I just finished the Data Science with Python track on DataCamp, and I am wondering whether this competition fits my current level as a place to start. If so, is there anything I can read to get started? I am just looking for guidance and am not worried about the score or prize money right now. Thanks in advance!
In my opinion this is a difficult competition to start with because it is multimodal: you need to combine two modalities (text and image).
For me, a good starting point is to go to the Hugging Face Transformers library and fine-tune a model like DistilBERT on the text alone. Then you can fine-tune a pretrained ResNet on the images and combine both outputs with an MLP.
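To make the "combine both outputs with an MLP" step a bit more concrete, here is a minimal PyTorch sketch of one way to do it. It is only an illustration under some assumptions of mine: the class name `LateFusionClassifier`, the hidden size, the dropout rate, and the learning rate are all made up, and the random tensors simply stand in for features you would extract from DistilBERT (768 dimensions for the base model) and a ResNet-50 (2048 dimensions after pooling). I also assume "outputs" means feature vectors that get concatenated; combining the two models' logits would be another reasonable reading.

```python
import torch
from torch import nn

TEXT_DIM = 768    # hidden size of the base DistilBERT checkpoint
IMAGE_DIM = 2048  # pooled feature size of ResNet-50 (ResNet-18/34 give 512)


class LateFusionClassifier(nn.Module):
    """Hypothetical fusion head: concatenate both feature vectors, then apply an MLP."""

    def __init__(self, text_dim=TEXT_DIM, image_dim=IMAGE_DIM,
                 hidden_dim=256, num_classes=2):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(text_dim + image_dim, hidden_dim),
            nn.ReLU(),
            nn.Dropout(0.1),
            nn.Linear(hidden_dim, num_classes),
        )

    def forward(self, text_features, image_features):
        # Join the two modalities along the feature dimension.
        fused = torch.cat([text_features, image_features], dim=-1)
        return self.mlp(fused)


torch.manual_seed(0)
batch_size = 4

# Placeholders (assumption): in practice these come from the two encoders.
text_features = torch.randn(batch_size, TEXT_DIM)
image_features = torch.randn(batch_size, IMAGE_DIM)
labels = torch.tensor([0, 1, 0, 1])  # dummy binary labels

model = LateFusionClassifier()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

# One illustrative training step on the fusion head.
logits = model(text_features, image_features)
loss = loss_fn(logits, labels)
optimizer.zero_grad()
loss.backward()
optimizer.step()

print(logits.shape)  # one row of class scores per example
```

Training only the small head on precomputed features is cheap, so it can be a gentle first experiment before fine-tuning the encoders themselves.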
You can find a lot of useful information in this paper: https://arxiv.org/abs/2005.04790
The winners' code is also available.
I really appreciate your time and your answer. I am looking at these resources now.
Personally, I think this competition could be tough for newbies. But if you are interested, here are the prize-winning solutions and their papers: