    Relate 360 Architecture

      I'm trying to set up a fresh install of Relate 360 using AWS EMR. The installation and configuration guide explains how to install the package, but it doesn't describe what infrastructure architecture should be built.


      Should Relate 360 be installed from an instance independent of EMR? Should the installation be included in the EMR bootstrapping? EMR already sets up ZooKeeper, Hadoop, Hive, etc. Is EMR a separate component that is simply hooked into Relate 360?



      Are there any diagrams or guiding documentation for setting up the infrastructure architecture?

          Relate 360 needs to be installed on only one node (an edge node is usually recommended).


          Once it is installed, the Relate 360 jobs you run are submitted to YARN, so as long as the Hadoop services are accessible from the edge node, it should be fine.
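One way to confirm that the Hadoop services are reachable from the edge node is a simple TCP check. The following is a minimal sketch, not part of Relate 360: the port numbers are commonly used defaults and are assumptions here, the helper names `is_reachable` and `check_services` are hypothetical, and it assumes all services listen on a single master host. Verify the ports and hosts against your own cluster configuration.

```python
import socket

# ASSUMPTION: commonly used default ports; your cluster may differ.
ASSUMED_PORTS = {
    "zookeeper": 2181,
    "yarn-resourcemanager": 8032,
    "hdfs-namenode": 8020,
    "hiveserver2": 10000,
    "hbase-master": 16000,
}


def is_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


def check_services(host, ports=ASSUMED_PORTS, timeout=3.0):
    """Map each service name to whether its port accepts TCP connections."""
    return {name: is_reachable(host, port, timeout) for name, port in ports.items()}
```

Run from the edge node, `check_services("<master-node-hostname>")` returns a dictionary of service names to booleans. A `False` entry only means the port did not accept a connection; it does not distinguish a stopped service from a security-group or firewall rule.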


          As per the Relate 360 PAM for 10 HF9, the supported version is EMR 5.14.


          Regarding architecture, you can size the Hadoop cluster to your requirements, depending on data volume. The minimum services needed for Relate 360 jobs to run are ZooKeeper, YARN, Hive, HBase, and Hadoop.
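As a quick sanity check when planning the cluster, the minimum service list above can be compared against what a cluster actually provides. This is only an illustrative sketch: the function name `missing_services` is hypothetical, and the list of available services has to come from your own inventory of the cluster (nothing here queries EMR).

```python
# Minimum services named in the reply above, lowercased for comparison.
REQUIRED_SERVICES = {"zookeeper", "yarn", "hive", "hbase", "hadoop"}


def missing_services(available):
    """Return the required services absent from `available`, sorted by name."""
    normalized = {name.strip().lower() for name in available}
    return sorted(REQUIRED_SERVICES - normalized)
```

For example, `missing_services(["Hadoop", "Hive", "YARN"])` reports that HBase and ZooKeeper still need to be added to the cluster.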


          Please let me know if you are looking for anything more.



            This is an old architecture diagram. Check whether it helps.