Garbage In, Garbage Out

A famous saying among computer programmers is "garbage in, garbage out." This saying, often abbreviated GIGO, refers to the fact that computers cannot tell the difference between good data and bad data. If a user provides bad data as input to a program, the program processes that bad data and produces bad data as output.

An example of bad data is a user typing 400 hours into a payroll program when asked for the hours worked that week. Humans easily spot that error. We know that no one can work 400 hours in a week, since there are only 168 hours in a week. A human would guess that the user accidentally added an extra zero and most likely meant to enter 40 hours for that week.
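The payroll example can be made concrete with a minimal sketch. The function name gross_pay and the hourly rate of 20.0 are assumptions chosen for illustration; they do not come from any particular payroll system. The program has no notion that 400 hours is impossible, so it calculates a paycheck for it just as readily as for 40 hours.

```python
def gross_pay(hours, hourly_rate):
    """Multiply hours by rate, with no check on whether the hours make sense."""
    return hours * hourly_rate


# Hypothetical hourly rate, for illustration only.
print(gross_pay(40, 20.0))   # 800.0
print(gross_pay(400, 20.0))  # 8000.0 (garbage in, garbage out)
```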

The integrity of a program's output is only as good as the integrity of its input. Therefore, we should design programs so that bad input is never accepted. Input should be inspected before it is processed. If the input is invalid, the program should discard it and ask the user to enter the correct data. This process is called input validation.
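The inspect, discard, and ask again cycle described above might look like the following sketch for the hours-worked example. The names parse_hours and read_hours are hypothetical, and the upper limit of 168 is only the physical maximum for a week; a real payroll system would likely enforce a much lower limit.

```python
HOURS_IN_WEEK = 7 * 24  # 168


def parse_hours(text):
    """Return the hours as a float, or None if the text is not a valid value."""
    try:
        hours = float(text)
    except ValueError:
        return None  # not a number at all, such as "forty"
    if 0 <= hours <= HOURS_IN_WEEK:
        return hours
    return None  # a number, but outside the possible range


def read_hours(prompt="Hours worked this week: ", read=input):
    """Keep asking until the user enters a valid number of hours."""
    while True:
        hours = parse_hours(read(prompt))
        if hours is not None:
            return hours
        print(f"Please enter a number between 0 and {HOURS_IN_WEEK}.")
```

Keeping the check in parse_hours, separate from the prompting loop, is one possible design choice: the rule can then be tested without simulating keyboard input. With this sketch, an entry of 400 is discarded and the user is asked again.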

Input validation is the outer defensive perimeter for your application. This perimeter protects the core business logic, processing, and output generation. Web applications are particularly vulnerable to attack, so it is very important to validate all input from the user. The Open Web Application Security Project (OWASP) publishes a cheat sheet on input validation.

We've all encountered a Web form that asked us for our email address. Not wanting to share it, we've probably all just typed jumbled characters to fill the box. However, good input validation code probably told us that what we typed was not a valid email address. Does the site have a list of all valid email addresses in the world? No! It simply tested to make sure (1) there was no space in our email address, (2) there was one, and only one, @ symbol in our email address, and (3) there was at least one dot (or period) in our email address, followed by a valid top-level domain such as .com or .net. It probably also checked that there were characters before the @ symbol, along with a number of other things that would make an email address invalid.
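The checks listed above can be sketched as a single function. This is an illustration only, not how any particular site validates addresses: the name looks_like_email is hypothetical, the KNOWN_TLDS set is a tiny sample rather than the real list of top-level domains, and the sketch assumes the dot must appear in the part after the @ symbol.

```python
# Small illustrative sample; the real list of top-level domains is much longer.
KNOWN_TLDS = {"com", "net", "org", "edu"}


def looks_like_email(address):
    """Return True if the address passes a few common-sense format checks."""
    # (1) No spaces or other whitespace anywhere in the address.
    if any(ch.isspace() for ch in address):
        return False
    # (2) One, and only one, @ symbol.
    if address.count("@") != 1:
        return False
    local, domain = address.split("@")
    # There must be characters before the @ symbol.
    if not local:
        return False
    # (3) At least one dot after the @, followed by a known top-level domain.
    name, dot, tld = domain.rpartition(".")
    if not dot or not name:
        return False
    return tld.lower() in KNOWN_TLDS


print(looks_like_email("asdfjkl"))             # False
print(looks_like_email("nobody@example.com"))  # True
```

The second call returns True even though no such mailbox may exist. The function checks only the form of the address, not whether anyone can receive mail there.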

Keep in mind that validation does not guarantee that the data is free of errors. It simply looks for common mistakes and rejects obviously bad input.