{"id":361,"date":"2015-03-20T09:53:24","date_gmt":"2015-03-20T09:53:24","guid":{"rendered":"http:\/\/www.gallop.net\/blog\/?p=361"},"modified":"2018-10-03T16:24:18","modified_gmt":"2018-10-03T10:54:18","slug":"2-major-challenges-of-big-data-testing-2","status":"publish","type":"post","link":"https:\/\/www.cigniti.com\/blog\/2-major-challenges-of-big-data-testing-2\/","title":{"rendered":"2 Major Challenges of Big Data Testing"},"content":{"rendered":"
We all know that testing comes with any number of challenges: lack of resources, lack of time, and lack of testing tools. The industry has faced, probed, and experimented its way out of most of the challenges of data testing. Having overcome so many of them, you would think developers could now sit back and relax.<\/p>\n
Not really. Those challenges were small fry compared to the BIG one. We are, of course, talking about the BIG problem that the industry is currently wrestling with: Big Data Testing. What are these challenges then?<\/p>\n
Big Data testing is more challenging than other types of data testing because, unlike normal data, which is structured and contained in relational databases and spreadsheets, big data is semi-structured or unstructured. This kind of data cannot be contained neatly in database rows and columns, which makes it that much harder to test. To top it all, just testing in your own time frame isn\u2019t enough. What the industry needs today is real-time<\/em> big data testing in agile environments. Large-scale big data technologies often entail many terabytes of data. Storage issues aside, testing these terabytes, which usually take servers many months to import, in the short development iterations that are typical of an agile process is no small challenge.<\/p>\n So let\u2019s look at how this can impact two of the many facets of testing:<\/p>\n 1. Automation<\/strong><\/p>\n Automation seems to be the easiest way out in most testing scenarios. No scope for human error! That seems very appealing when you\u2019ve faced some painful \u2018silly\u2019 mistakes that can mess up your code big time. But there are a few challenges here:<\/p>\n Expertise: Setting up automated testing criteria requires someone with quite a bit of technical expertise. Big Data hasn\u2019t been around long enough to have seasoned professionals who have dealt with the nuances of testing this kind of data.<\/p>\n Unexpected glitches: Automated testing tools are programmed to catch problems that are commonly expected. Big data, with its unstructured and semi-structured formats, can throw up unprecedented problems that most automated testing tools are not equipped to handle.<\/p>\n More Software to Manage: Writing the automation code to manage unstructured data is quite a task in itself, creating more work for developers, which defeats the whole point of automation!<\/p>\n 2. Virtualization<\/strong><\/p>\n This is one of the integral phases of testing. 
What a great idea to test the application out in a virtual environment before you launch it in the real world! But then again, here are the challenges:<\/p>\n Virtual machine latency: This can create timing problems, which is definitely not something you want, especially in real-time big data testing. As it is, fitting big data testing into an agile process is already a herculean task!<\/p>\n Management of images and the VM: Terabytes of data naturally get more complicated to handle once virtual machine images are involved. Seasoned testers know the hassles of configuring these images on a virtual machine. On top of this, there is the matter of managing the virtual machine on which these tests are to be run!<\/p>\n