Recently, our social network analysis methodology hit a snag: the computer I am using started to crash when attempting to process our larger data sets. The data sets are not extremely large at this stage (approx. 8MB Excel sheets with about 80,000 lines of text), but they are nonetheless too big for my MacBook Pro to handle. Just to remind you, we are using Gephi (open source) as our analytics software.
I started looking into virtual servers, where Amazon EC2 is the benchmark. Its servers seem to be located in North America, specifically San Francisco, and I have been advised that this location is good when scraping data from technology companies like Twitter and Facebook, who also host their data in a similar geographical area. However, Amazon does appear to be a little too expensive for the research budget, although it is very tempting to spin up some servers to collect and process our data quickly.
The second option was to lean on NeCTAR, the national supercomputer infrastructure for Australian researchers. I established two medium virtual servers (2 vCPU, 8GB RAM, 60GB local VM disk) and installed an Ubuntu operating system, but had difficulty talking to the system (happy to take input from anyone here).
Then we had a meeting with the Information and Communication Technology (ICT) people at the University of Sydney, who have been very helpful in their approach. We have been liaising with Justin Chang, who provided us with an improved version of Gephi that essentially enables us to use more RAM on my local machine to process the data sets. Justin provided me with a disk image, which I installed and tested, and I was able to get moving with the analysis again.
I asked if I could share this version of Gephi with our readers, to which he agreed, and he provided a step-by-step guide on how he created a version of Gephi with an increased RAM allocation:
- Download the ‘Gephi’ .dmg file from: https://gephi.org/users/download/
- Open the .dmg file
- Copy the Gephi.app file to a folder on your desktop
- Ctrl + Click the Gephi.app file and click Show Package Contents
- Navigate Contents > Resources > Gephi > etc and open the gephi.conf file in a text editor
- Change the maximum Java RAM allocation from:
default_options="--branding gephi -J-Xms64m -J-Xmx512m -J-Xverify:none -J-Dsun.java2d.noddraw=true -J-Dsun.awt.noerasebackground=true -J-Dnetbeans.indexing.noFileRefresh=true -J-Dplugin.manager.check.interval=EVERY_DAY"
to:
default_options="--branding gephi -J-Xms1024m -J-Xmx2048m -J-Xverify:none -J-Dsun.java2d.noddraw=true -J-Dsun.awt.noerasebackground=true -J-Dnetbeans.indexing.noFileRefresh=true -J-Dplugin.manager.check.interval=EVERY_DAY"
This enables Gephi to utilise up to 2GB of RAM when processing data. You can allocate any amount of RAM here, as long as it is less than your system's RAM.
- Save the file
- Run the application ‘Disk Utility’
- From within Disk Utility, click File > New > Disk Image from Folder, select the folder that you created on the desktop, and then click Image.
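For anyone who would rather script the gephi.conf change than edit it by hand, here is a minimal Python sketch of the same substitution. It is not part of Justin's procedure. The function name set_gephi_heap and the shortened sample line are my own illustrations, and the sketch assumes the heap flags appear as -J-Xms and -J-Xmx followed by a number, as in the lines above. It replaces every occurrence in the text it is given, so check the result before saving it back over the real file.

```python
import re


def set_gephi_heap(conf_text: str, min_mb: int, max_mb: int) -> str:
    """Return conf_text with the -J-Xms / -J-Xmx heap values replaced.

    Hypothetical helper for illustration; sizes are given in megabytes.
    """
    if min_mb > max_mb:
        raise ValueError("initial heap must not exceed maximum heap")
    # Replace the initial heap size, e.g. -J-Xms64m -> -J-Xms1024m
    conf_text = re.sub(r"-J-Xms\d+[kmgKMG]?", f"-J-Xms{min_mb}m", conf_text)
    # Replace the maximum heap size, e.g. -J-Xmx512m -> -J-Xmx2048m
    conf_text = re.sub(r"-J-Xmx\d+[kmgKMG]?", f"-J-Xmx{max_mb}m", conf_text)
    return conf_text


# Shortened sample of the default_options line (assumption: real line is longer)
sample = 'default_options="--branding gephi -J-Xms64m -J-Xmx512m -J-Xverify:none"'
updated = set_gephi_heap(sample, 1024, 2048)
print(updated)
```

To apply it to the real file, you would read gephi.conf into a string, pass it through the function, and write the result back, keeping a copy of the original in case Gephi fails to start.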