Four Programming Languages Creating a Complete Website Scraper Application: CSharp, VB.Net, Java, Visual Basic for Applications

Link Em Up, Publishing div

After finishing these pages, you will have a complete application that runs as either a console or a desktop program. You will use three languages (C#, VB.Net and Java) to create this application. Each chapter covers a single language and either the desktop or the console application coded in that language; the Java chapter covers only the desktop version. To automate the console program, we will use an Excel sheet and VBA code. The desktop application allows more flexibility in web page processing, with entry fields for beginning and ending text along with DIVs and other processing options. Enjoy this learning experience.
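
The "beginning and ending text" idea mentioned above can be sketched in Java, one of the book's languages. This is a minimal illustration and not the book's own code: the class name MarkerExtractor, the method extractBetween and the sample HTML are assumptions made for this example. Only indexOf, substring and trim are standard String methods.

```java
// Hypothetical helper; the names are illustrative and not taken from the book.
class MarkerExtractor {

    /**
     * Returns the trimmed text between the first occurrence of begin and the
     * next occurrence of end, or an empty string if either marker is missing.
     */
    static String extractBetween(String page, String begin, String end) {
        int start = page.indexOf(begin);
        if (start < 0) {
            return "";                      // beginning marker not found
        }
        start += begin.length();            // skip past the marker itself
        int stop = page.indexOf(end, start);
        if (stop < 0) {
            return "";                      // ending marker not found
        }
        return page.substring(start, stop).trim();
    }

    public static void main(String[] args) {
        String html = "<html><body><div id=\"main\"> Hello, world. </div></body></html>";
        // prints "Hello, world."
        System.out.println(extractBetween(html, "<div id=\"main\">", "</div>"));
    }
}
```

Returning an empty string when a marker is missing keeps a batch run going when one page is structured differently. A real tool might prefer to log such pages for manual review.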

This list includes some of the types and commands covered, with the languages that use them:

WebResponse, WebRequest, HttpWebRequest, StreamReader (C#/VB)

GetResponse, Regex.Replace, String.Replace, IndexOf (C#/VB)

Substring, ReadLine, Trim, WriteLine (C#/VB)

EndsWith, AddRange, ReadToEnd, Count (C#/VB)

GetCommandLineArgs, GetResponseStream (VB)

getText, endsWith, split, length, openConnection (Java)

toString, BufferedReader, getSelectedIndex, replaceAll (Java)

isEmpty, substring, indexOf, readLine, PrintWriter, write (Java)

ActiveCell, Value, ChDir, Shell, Activate (VBA)
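
To give a feel for how a few of the Java entries in this list fit together, here is a small sketch that reads HTML line by line with BufferedReader and readLine, strips tags with replaceAll, and skips blank lines with isEmpty. It is an illustration under stated assumptions, not the book's implementation: the class TagStripper, the method readAndStrip and the regular expression are invented for this example. The sketch reads from a StringReader so it runs without a network. For a live page, one option is to pass `new InputStreamReader(new URL(address).openConnection().getInputStream())` instead, where `address` is a placeholder.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Hypothetical helper; the names are illustrative and not taken from the book.
class TagStripper {

    /** Reads every line from source, removes HTML tags, and joins the non-empty lines. */
    static String readAndStrip(Reader source) {
        StringBuilder out = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                // Naive tag removal: assumes no tag spans more than one line.
                String text = line.replaceAll("<[^>]*>", "").trim();
                if (!text.isEmpty()) {
                    if (out.length() > 0) {
                        out.append('\n');
                    }
                    out.append(text);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String html = "<h1>Title</h1>\n<p>First <b>bold</b> line.</p>\n<br>\n";
        System.out.println(readAndStrip(new StringReader(html)));
    }
}
```

A regular expression is a blunt instrument for HTML. It works for similarly structured pages like those described here, but a real HTML parser would be the safer choice for arbitrary sites.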

Why would you want to work with the same program in multiple languages? The simple answer is "versatility." You may come across a need for Java where a .Net-based language just won't work. A perfect example is Windows versus Linux web hosting. If you have designed a .Net program and placed it on a Windows-hosted site, it will work beautifully. If you then change the hosting plan to Linux, the .Net program will not work without some tweaking or an interpreter. Had the program been written in Java, however, it would have moved over without trouble.

Why would you want a web site text extraction program? If you only need to capture the main text from a few web pages, building one would be more trouble than it is worth. If you are migrating a web site designed in ASP.NET into another format, such as a CMS, this approach can be quite useful. If the site has 1,000 pages and all are similarly structured, it may take a single person a week to manually copy and paste the body text from these pages. Using the automated approach, with a pause between each page for accuracy purposes, approximately 700 pages per hour can be processed. That equates to a tremendous labor savings.
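
The figures above can be turned into a quick back-of-the-envelope estimate. The sketch below is only arithmetic on the numbers quoted (1,000 pages, 700 pages per hour). The 40-hour work week used for the manual comparison is an assumption, and the class and method names are hypothetical.

```java
import java.util.Locale;

// Hypothetical estimate helper; the names are illustrative and not taken from the book.
class BatchEstimate {

    /** Time budget per page, pause included, at the given throughput. */
    static double secondsPerPage(double pagesPerHour) {
        return 3600.0 / pagesPerHour;
    }

    /** Total minutes needed to process the whole site at the given throughput. */
    static double minutesForSite(int pages, double pagesPerHour) {
        return pages / pagesPerHour * 60.0;
    }

    public static void main(String[] args) {
        int pages = 1000;            // from the text
        double pagesPerHour = 700;   // from the text
        double manualHours = 40;     // ASSUMPTION: "a week" taken as a 40-hour work week

        double minutes = minutesForSite(pages, pagesPerHour);
        System.out.println(String.format(Locale.ROOT,
                "Per page: %.1f s, whole site: %.1f min, speed-up: about %.0fx",
                secondsPerPage(pagesPerHour), minutes, manualHours * 60.0 / minutes));
    }
}
```

At the quoted rate, each page gets a little over five seconds, including the pause, and the whole site takes under an hour and a half.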


About the author

Stephen J. Link is a “computer guy” by profession, an author by hobby, and a layman in the study of God’s Word. His computer support book, “Link Em Up On Outlook,” was published in 2004 as a paperback (renamed "Power Outlook" in reprint). He also has over 125 articles covering various topics published on his own blog and on independent sites, and he has published books on a number of other topics. As a programmer, he has a unique approach to helping you master the code needed to automate tasks and add efficiency to your client's or employer's processes. Along this journey, you will have an opportunity to "dabble" in four different languages - CSharp, VB.Net, Java and Visual Basic for Applications.

Why use the word "dabble"? Because there is no way, in a single program, to squeeze in all of the power that any programming language offers. Yes, you will see four languages and two platforms, but they all create the same functioning program. As with all software, web sites and anything else you may develop, it is never totally complete. As you work through the programs, or after finishing them, you are likely to see improvements that could have been made. That is the beauty of computer programming - the only limits are those imposed by your imagination and your funding.


Additional Information

Publisher: Link Em Up, Publishing div
Published on: Sep 6, 2014
Pages: 89
Language: English
Genres:
Computers / Programming / General
Computers / Programming / Microsoft
Computers / Programming Languages / General
Computers / Programming Languages / Java
Computers / Programming Languages / Visual BASIC
Content Protection: This content is DRM protected.
