An alternative to the MongoDB Connector for Hadoop is to use the programming language of our choice to export data from Hadoop, and then write it into MongoDB using the low-level driver or an ODM, as described in previous chapters.
For example, in Ruby, there are a few options:
- The webhdfs gem (available on GitHub), which uses the WebHDFS or HttpFS Hadoop API to fetch data from HDFS
- System calls, using the Hadoop command-line tool and Ruby's system() call
Whereas in Python, we can use the following:
- HdfsCLI, which uses the WebHDFS or the HttpFS Hadoop API
- libhdfs, a JNI-based native C library that wraps the HDFS Java client
All of these options require an intermediate server between our Hadoop infrastructure and our MongoDB server. In exchange, they allow more flexibility in the ETL process of exporting and importing data.
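As a rough illustration of the Python route, the following sketch reads a file from HDFS with HdfsCLI and bulk-inserts it into MongoDB with PyMongo. It assumes the HDFS file holds newline-delimited JSON, which may not match our actual export format. The function names (`batched`, `load_json_lines`, `export_hdfs_to_mongo`) and all parameters are hypothetical names chosen for this example. It is a minimal sketch under those assumptions, not a production pipeline.

```python
import json
from itertools import islice


def batched(iterable, size):
    """Yield lists of at most `size` items from `iterable`."""
    iterator = iter(iterable)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


def load_json_lines(lines, collection, batch_size=1000):
    """Parse newline-delimited JSON and insert it in batches.

    `collection` only needs an insert_many() method, so a PyMongo
    collection works here. Returns the number of documents inserted.
    """
    # Skip blank lines; every other line is assumed to be one JSON document.
    docs = (json.loads(line) for line in lines if line.strip())
    inserted = 0
    for batch in batched(docs, batch_size):
        collection.insert_many(batch, ordered=False)
        inserted += len(batch)
    return inserted


def export_hdfs_to_mongo(namenode_url, hdfs_path, mongo_uri, db_name, coll_name):
    """Stream one HDFS file into a MongoDB collection (hypothetical wiring)."""
    # Third-party imports stay local so the helpers above have no dependencies.
    from hdfs import InsecureClient  # HdfsCLI: pip install hdfs
    from pymongo import MongoClient  # PyMongo: pip install pymongo

    hdfs_client = InsecureClient(namenode_url)
    collection = MongoClient(mongo_uri)[db_name][coll_name]
    # With a delimiter set, read() yields the file piece by piece
    # instead of loading it into memory all at once.
    with hdfs_client.read(hdfs_path, encoding="utf-8", delimiter="\n") as lines:
        return load_json_lines(lines, collection)
```

Running `export_hdfs_to_mongo` on the intermediate server with the WebHDFS URL of our NameNode, an HDFS path, and a MongoDB connection string would copy the file across. Any transformation step of the ETL process can be added inside `load_json_lines` before the insert.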