
Dr. Pauline O'Shaughnessy

Intersect and NCI continue to meet attendees of our digital research training program partnership and seek their feedback. In October, we spoke with Dr Pauline O'Shaughnessy about her learning stories from participating in the NCI/Intersect Training Program. Pauline is a lecturer in Statistics in the School of Mathematics and Applied Statistics at the University of Wollongong. Her research focus is on Data Privacy protection using statistical methods. 

Pauline raised a very good point when asked about the reasons for attending the training program: “With the evolution of the computational technologies and pressing need for data analytics skills, researchers tend to fall behind if they do not upskill”. She attended the Data Manipulation and Visualisation in R and Introduction to Machine Learning using R training workshops and obtained knowledge and skills that are applicable to her research. 

Speaking about how impactful the courses, tools, and infrastructure are, Pauline considered our training courses very valuable to Higher Degree Research students and Early Career Researchers. “The introductory level High Performance Computing training is very useful and easy to digest, especially for people who come from a non-computer science background”, Pauline said. She also appreciated the free access to digital skills training workshops and the fact that they bridge the gap between basic and advanced programming skills.

“Having free training is definitely a key because we don’t really have the luxury to invest a couple of hundred dollars to learn these research techniques. The NCI/Intersect training offers a systematic way of learning a programming language from scratch, filling the gaps between the basics and the domain-specific skills.”

In terms of online training resources, Pauline found the self-paced courses less engaging and, at times, distracting. She preferred our instructor-led training because attendees commit time to learn in a narrative environment. “I'd rather the video (instructors) tell me what to do on the spot as it’s distraction free and you focus on what you've got and you get very much out of it. As with everything online (self-paced resources), you always had this mentality saying, I have opportunity (to do it later), but never really come back and read it”. Pauline also expressed strong interest in attending face-to-face workshops in the future.



Ke Ding

Ke Ding is an MPhil student in the Wen Group at the John Curtin School of Medical Research. He joined NCI as an Associate Training Officer in January this year.

"Being an associate training officer is a great opportunity for me to share my experience with HPC and help other researchers boost their projects with the latest supercomputer (Gadi)."

Ke's research projects focus on building deep learning (DL) models to uncover the interactions among RNA-binding proteins and to summarise the characteristics of transcription factor binding sites, and he has been using Gadi's GPU nodes to train deep neural networks. In addition, because genomic data sets are enormous, he needs an HPC system to process them.

"My supervisor, Associate Professor Jiayu Wen, showed me how to distribute a training task over multiple nodes on Gadi, which reduced the overall training time from nearly five days on my desktop to less than two hours on Gadi. I think everyone should be aware of the power of HPC and know how to take advantage of it."

Ke ran his first workshop to assist researchers in using Gadi for their DL projects in computational biology, and he is developing content to help people run more advanced applications on Gadi. He is also on the committee of the 5th APAC HPC-AI competition, where his job is to prepare AI tasks for the competition and provide technical support to the contestants.

"The competition aims to provide a platform for students to learn and use advanced HPC and AI technologies. It is my honour to contribute to this program. I am glad to see many students learn to use HPC through the competition. It will definitely benefit them in their future research."



Dr. Gavin Chapman

Intersect and NCI continue to meet attendees of our digital research training program partnership and seek their feedback. This month, we are excited to hear from Dr. Gavin Chapman, who is a developmental biologist at the Victor Chang Cardiac Research Institute (VCCRI). Gavin’s research is focused on understanding the genetic causes of malformations and he is now applying human genomics to identify genes that, when mutated, cause malformations such as congenital heart disease.

Gavin is using the Lightweight Analysis of Morphological Abnormalities (LAMA), an automated pipeline for phenotyping mouse embryos. The LAMA pipeline is an easy way of producing volumetric data and provides statistical analysis. Researchers need to use Python to work with LAMA, and they initially used VCCRI’s in-house facility for data processing. The increasingly large amount of data generated has required them to implement the LAMA pipeline on NCI’s supercomputing infrastructure in order to take advantage of its parallel computing power.

Gavin participated in the Learn to Program: Python and Getting started with HPC using PBS Pro courses, and found them extremely useful. Gavin said, “The HPC course is so relevant to what I am trying to do and I appreciate the course material. It is a nice supplement to the documentation on the NCI gadi. In terms of data manipulation with pandas in Python, since I did not have much background of that, it was really good to learn what it is capable of. I have realised that I can use pandas to manipulate the data frames that are generated by LAMA.”

Gavin also highlighted how impactful and beneficial the machine learning courses would be for his group's research.

“We aim to use machine learning to support the recognition of embryos that are terribly malformed, because LAMA is not really good at doing it. I have tried to learn the Linear Regression in books but I haven’t learnt much. By contrast, the same concepts were explained so well in the Intersect/NCI courses.”

When asked about the main reason he returned to attend the courses, Gavin said that, in the training, he was able to get a clear understanding of what the language or packages are capable of in a relatively short time frame. This is very different from some self-paced courses, in which people have to invest a lot of time. Gavin also mentioned that our course material is well developed and is an all-in-one reference with research-related examples, which are normally difficult to find online.



Khuong Tran

Khuong Tran is a PhD candidate and a Research Assistant at the University of Technology Sydney. His research involves modelling AI-enabled agents to perform decision making under uncertainty using reinforcement learning and deep learning architectures. Khuong is enthusiastic about teaching and sharing his knowledge on digital tools and technologies (such as programming, and applied machine learning) and has been an eResearch Trainer at Intersect for 3 years. Khuong has delivered over 14 courses throughout the NCI/Intersect training program and we asked him to share some of his teaching stories from the program.

In terms of his motivation for being an eResearch Trainer at Intersect, Khuong told us that Intersect provides a great opportunity for him, as a PhD student, to exercise his knowledge by teaching people how to use digital tools. As an international student, he also finds that this job provides extra support for his study and life in Australia. “I have also got a chance to talk to researchers from different universities and institutions and build networks when delivering the courses”, Khuong added.

“Thanks to the popularity of Python frameworks, like Numpy, Pandas, Seaborn and Scikit-learn, people with no or minimal background in programming can jump in and start to manipulate high dimensional arrays or perform visualisation with just a few lines of code. These things were not easily accessible [when I started my PhD.]” 

Khuong emphasised that the Intersect courses facilitate the implementation of such tools in research: “We are not teaching all the theories of programming. Instead, we raise the awareness of the tools/technologies that are ready to use, and teach how to read the documentations and to utilise them immediately after the courses. This is extremely beneficial for researchers who have no background in Computer Science or Engineering.”

“It is encouraging to see that more researchers from, for instance the Social Sciences or the Business School, are adapting the programming and HPC tools for their research”, Khuong further added. He acknowledged the contribution of Intersect courses in terms of helping researchers build their confidence to transition from conventional tools to more cutting-edge technologies, resulting in significant improvement to their research productivity. 

“In terms of machine learning (my research field), there are many tutorials out there and as a beginner, it can be hard to pick the appropriate learning material. The Intersect machine learning courses, normally delivered in three consecutive weeks, cover the fundamentals, terminologies, workflows, and more importantly, the applications of the relevant packages and functions. From my perspective, this would otherwise take beginners several months through the self-paced training in order to reach the same level of understanding of Intersect courses.”



Professor Attila Mozer

Following our first meeting with researchers from the Australian Nuclear Science and Technology Organisation, we spoke to Professor Attila Mozer from the University of Wollongong to seek his feedback on the impact of the NCI/Intersect training program. Professor Mozer is a chemical engineer, a physical chemist and a laser spectroscopist. He is currently a chief investigator at the ARC Centre of Excellence for Electromaterials and the project leader of Characterisation within the 3D Electromaterials theme.

Professor Mozer told us that losing access to the usual support from collaborators during the COVID-19 pandemic was one of the reasons for attending our training programs; through the courses, he could learn and utilise the tools and codes available. “You don't need to be a developer to use it. You can just use it and it is very reliable to the level that I can actually find it useful,” Professor Mozer explained. He also highlighted that the first Intersect/NCI Collaborative Course on HPC and Data in Materials Design and Discovery in 2021 motivated him to explore more courses offered by Intersect and NCI, and he later found our training program offerings.

When asked how our training differs from other training available and why he returned to our training program, Professor Mozer emphasised the course structure and the practicality of the knowledge acquired during the courses.

"I kept coming back to this training program because the courses were nicely organised.  I've done all the Python machine learning courses, because I'm very interested in that. I haven't done any research yet, but I want to, so that was the motivation to try to understand clearly and practically how to do it, not theoretically. There was also a logical progression from one course to the other, so that's why I didn't want to miss any of it. The training is very accessible and I can learn some practical relevance to what I'm doing." 

Professor Mozer also commented on how impactful the tools/technologies are in his research: “I think everybody should get some kind of training and understand what's possible. The tools and technologies we learnt during the course as well as the NCI infrastructure have a massive impact in every material science.”

When speaking about the benefits of the hands-on, instructor-led training, as compared to self-paced training, Professor Mozer added, “The practicality of the hands-on training is very important and so is troubleshooting. The instructors help with errors on the spot whereas you could spend hours finding a little typo doing self-paced training.”



Fred Fung

       

"Working as a training officer at NCI, there's no better position for me to engage with our community to help develop robust, scalable and sustainable software." 

Fred joined NCI in October last year when he was writing his PhD thesis.

"I'm grateful to undertake this role. Working for a supercomputer facility like NCI was always sitting at the back of my mind. It's where my training naturally grows into."

Majoring in Mathematics through undergraduate study at ANU and following his interest in computational mathematics in his PhD, Fred specialises in fault-tolerant HPC applications.

"While the computation power grows every year, it comes with ever greater energy demands. Solving a petascale matrix problem in an inefficient way can cost as much energy as is used by all the households in Canberra. Ideally, from both the user and the facility's perspective, we want to have fault recovery mechanisms in case of fault occurrences during program execution."

Fred is currently developing the NCI HPC Toolkit for NCI users, a training module that contains a series of parallel APIs and programming models, including OpenMP, MPI and CUDA.

"When I was working on my PhD, my supervisor Associate Professor Linda Stal's program, a 300k C++ parallel code written back in the 90s and used to run on the then supercomputer Fujitsu AP1000, I was amazed that it still compiled and ran smoothly on Gadi, our latest machine, owing to its portable and sustainable implementation. Likewise, my passion is to help our users devise efficient codes that hopefully will run in the future."



Talk to the Australian Nuclear Science and Technology Organisation (ANSTO)

Mr Nicholas Howell and Dr. Paul Callaghan

Since October 2021, Intersect Australia and the National Computational Infrastructure (NCI) have collaborated to deliver a comprehensive training program, which aims to upskill NCI HPC users in cutting-edge digital technologies and tools, including Python, R, and Julia programming; machine learning; research computing; and parallel programming. To date, this program has trained over 600 researchers in 27 courses, drawn from 40+ institutions and organisations across Australia.

To better understand how the NCI/Intersect training program has supported the research and professional careers of the participants, we arranged to meet with several of the attendees to capture their feedback. Their feedback shows how the training program is helping researchers and professionals to use cutting-edge tools and technologies, as well as NCI’s infrastructure, more effectively and efficiently in their research.

Our first meeting was with two researchers from the Australian Nuclear Science and Technology Organisation (ANSTO). Dr. Paul Callaghan is an Imaging Neurophysiologist, who applies PET/CT and SPECT/CT multimodal molecular imaging techniques for investigating animal models of neurological and psychiatric disease. Mr Nicholas Howell is a biologist also working within the Nuclear Science & Technology group.

When asked about the main reason for attending this training program, Paul told us that “completely inappropriate tools have been and are being used to do the job because of limitations in knowledge. We needed to upskill so we can actually even do a small scale deployment.” Nicholas added that he attended the course so he could work effectively on the same problem with his colleagues at ANSTO.

“We need to train ourselves in which tools are appropriate for each of the tasks, for example, scripted image processing workflows (using Python/ImageJ), pre processing of data from analytical instruments (for example radiation counters, microplate readers, Python or R), and data visualisation and statistics in R/Python (in particular, using existing code for tests not available in ‘easy’ commercial statistical packages using R).” - Dr Paul Callaghan & Nicholas Howell

Both have attended five different workshops under this training program, so we were interested to know what kept them coming back for more. Nicholas has done both free and paid online courses, which were quite good, but he mentioned that “the NCI/Intersect courses are really relevant to what I am trying to do. I really enjoyed the training format with live questions and found that difficult concepts have been stepped through very nicely in the courses. I've been getting a lot out of them, so that's why I keep coming back.”

Paul has also attended other online courses and found these courses often switched between concepts without explaining why, leaving attendees confused. “You are coding. You're not programming at our stage of the career and many of the other courses are programming courses”, Paul emphasised. Paul also said about the NCI/Intersect courses that “I was able to achieve some tasks using that course material and then build upon it with the next one afterwards.” 

Paul and Nicholas both agreed that the toolset that's being taught in NCI/Intersect courses is entirely valid and well-focused for the deployment and development of their models, which is very different from many of the other online courses.

“The breadth of the NCI/Intersect courses in first the principles of the tools, then the use of that tool for data wrangling, and subsequently visualisation fits very well with our needs. Firstly, we have seen where we need to be, and the NCI/Intersect courses are a way to get there. Both myself and Nick have also done other courses (datacamp, EdX etc), and the focus of the NCI/Intersect courses has led to us being able to leave the training, and able to ‘do’ immediately, as the trainers really understand the differing needs compared to a more generic EdX course.” - Dr Paul Callaghan & Nick Howell

Speaking about the benefits of our hands-on, instructor-led workshops on attendees’ research careers, Nicholas recognised that attendees can be taught, but they don't truly learn until they have an opportunity to practice with the application. In response to what would be missing if they did not have access to these courses, Paul was concerned about the new starters (HDR students, researchers, and scientific staff), who are not trained and are not from a computational background. He said “I think we're doing those students a disservice to force them into using inappropriate tools and we're not going to be competitive without access to the training of the next generation, and it just reinforces the problem as people get busy”. Paul also suggested that it would be helpful, at the university level, to introduce basic concepts of data handling and data management.

This interview and article are jointly prepared by NCI and Intersect.



