Friday, August 29, 2008

Microsoft Access Graphical User Interface Design Tips

Microsoft Access Graphical User Interface Design Tips

Many people rush into designing the User Interface for their Access database before thinking about layouts, formats and usability. The following tips from David McQueen may greatly assist when considering all of the options available:

This is my primary role in designing Access applications, and having been drafted in plenty of times to "have a look" at some of the forms and reports that have been designed by others, I think that I am able to hand a bit of advice to others.
  1. Users love forms designed in Tahoma 8. I know this old hat to many developers but I am amazed by the amount of screens I come across with a combo of Times New Roman and Bauhaus.
  2. Design reports with Title, Date, UserName and Printed Time in the Header preferably.
  3. Think of the dumbest user ( I know that sounds insulting, but...) possible and ensure that even they can easily read what is on the screen.
  4. Ensure that the design pattern is consistent across the software. For example if you double click to open an item on one list ensure that you can do the same on another form.
  5. Use colour sparingly. Stick to the gray/blue backgrounds and limit the extensive use of colour. If there are items such as required fields on a form, use one background colour for all of these fields, e.g. yellow
  6. Ensure a consistent easy navigation between screens/modules. I prefer to have a consistent top toolbar wherever the user is in the software, or if users don't like that to have a consistent sidebar/treeview.
  7. Type error trapping/required input fields and match them to a dialog box. Nothing worse than a user seeing "Error 76a..blah, blah" when a standard procedure has gone belly up.
  8. Keep screens as uncluttered as possible and please, please, please.....ALIGN the controls on the form.
  9. For objects that are not applicable to a certain user or process, just gray them out/disable them, rather than deleting them. It helps especially when users have made a mental image of what the screen should look like.
  10. Ensure that data in fields is justified appropriately. Right for numbers (integers) and Left for strings.

Thanks to David for the above tips.

To see some examples of the designs being used for a User Interface please go to the User Interface Designs page.

Designing Good Relational Databases

Designing Good Relational Databases:

Databases have a reputation for being difficult to construct and hard to maintain. The power of modern database software makes it possible to create a database with a few mouse-clicks. The databases created this way, however, are typically the databases that are hard to maintain and difficult to work with because they are designed poorly. Modern software makes it easy to construct a database, but doesn't help much with the design aspect of database creation.

Database design has nothing to do with using computers. It has everything to do with research and planning. The design process should be completely independent of software choices. The basic elements of the design process are:

  1. Defining the problem or objective
  2. Researching the current database
  3. Designing the data structures
  4. Constructing database relationships
  5. Implementing rules and constraints
  6. Creating database views and reports
  7. Implementing the design

Notice that implementing the database design in software is the final step. All of the preceding steps are completely independent of any software or other implementation concerns.

Defining the problem or objective.

The most important step in database design is the first one: defining the problem the database will address or the objective of the database. It is important however, to draw a distinction between:
  • How the database will be used and
  • What information needs to be stored in it.

The first step of database design is to clearly delineate the nature of the data that needs to be stored, not the questions that will be asked to turn that data into information.

This may sound a little contradictory at first, since the purpose of a database is to provide the appropriate information to answer questions. However, the problem with designing databases to answer specific or targeted questions is that invariably questions are left out, change over time, or even become superseded by other questions. Once this happens, a database designed solely to answer the original questions becomes useless. In contrast, if the database is designed by collecting all of the information that an individual or organization uses to address a particular problem or objective, the information to answer any question involving that problem or objective can theoretically be addressed.

Researching the current database.

In most database design situations, there is some sort of database already in existence. That database may be Post-it notes, paper order forms, a spreadsheet of sales data, a word processor file of names and addresses, or a full-fledged digital database (possibly in an outdated software package or older legacy system). Regardless of its format, it provides one essential piece of information: the data that the organization currently finds useful. This is an excellent starting point for determining the essential data structure of the database. The existing database information can also provide the nucleus for the content of the new database.

Designing the data structures.

A database is essentially a collection of data tables, so the next step in the design process is to identify and describe those data structures. Each table in a database should represent some distinct subject or physical object, so it seems reasonable to simply analyse the subjects or physical objects relevant to the purpose of the database, then arrive at a list of tables.

This can work successfully, but it's a much better to objectively analyse the actual fields that you have identified as essential in your research and see what logical groupings arise. In many cases, structures that seemed distinct are really reflections of the same underlying subject. In other cases, the complete opposite is true. And to complicate matters, organizations can use the same terms to describe data that they use or collect in fundamentally different ways.

Once the tables have been determined and fields have been assigned to each, the next step is to develop the specifications for each field. The perfect field should be atomic: It should be unique in all tables in the database (unless it is used as a key) and contain a single value, and it should not be possible to break it into smaller components. This is also an appropriate time to start thinking about the kind of data that goes in each field. This information should be fairly clear from the research phase of the project, but sometimes questions remain. Some advance planning can be done to make it easier to implement the database in the software at a later time, such as identifying the type of fields and examining (or re-examining) existing data that you've collected to make sure that the data always fits the model you are constructing. It's much easier and cheaper to fix that now than wait until the database is being rolled out!

Constructing database relationships.

Once the data structures are in place, the next step is to establish the relationships between the databases. First you must ensure that each table has a unique key that can identify the individual records in each table. Any field in the database that contains unique values is an acceptable field to use as a key. However, it is a much better practice to add an arbitrary field to each table that contains a meaningless, but unique value. This value is typically an integer that is assigned to each record as it is entered and never again repeated. This ensures that each entered record will have a unique key.

Implementing rules and constraints.

In this step, the fields in the database are still fairly amorphous. Defining the fields as text or numeric and getting a rough feel for the types of data that the client needs to store has narrowed them down, but there is room for further refinement. Rules and constraints typically lead to cleaner data entry and thus better information when using the data. Business rules and constraints limit the format that data can take or the ways that data tables can be related to other data tables.

Some of these constraints are imposed by the nature of the data itself; social security numbers are always in the same nine-digit format. This type of constraint is normally implemented to make sure that data is complete and accurate. In other cases, the situation itself explicitly constrains the data. The possible values for the data are usually checked against a list or the choice of values is otherwise constrained. This type of constraint is usually easy to implement and easy to change.

Creating database views and reports.

Now that the data design is essentially complete, the penultimate step is to create the specifications that help turn the data into useful information in the form of a report or view of the data. Views are simply collections of the data available in the database combined and made accessible in one place. It could be as simple as a subset of an existing data table or as complicated as a collection of multiple tables joined on particular set of criteria. Reports on the other hand, are typically snapshots of the database at a particular point in time.

Implementing the design in software.

All of the work to this point has been accomplished without explicitly worrying about the details of the program being used to produce the database. In fact, the design should only exist as diagrams and notes on paper. This is especially important at a later point when you or someone else need to update the database or port it to another package. Now it's time to boot the computer and get started.

Relational Database Design Requirements

Relational Database Design Requirements

A database is a collection of information related to a particular subject or purpose (e.g. tracking customer orders or maintaining a music collection). If your database isn't stored on a computer, or only parts of it are, you may be tracking information from a variety of sources that you have to coordinate and organize yourself.

NOTE: It doesn’t matter how simple the system is, a flat-file database design (only one table) should not be employed, because it has severe limitations and inefficiencies. Your database should contain at least two tables and set relationship between them.

A well-designed database is a prerequisite for fast data retrieval and updates.

Before you build the tables and other objects that will make up your system, it is important to take time to design it. A good design is the keystone to creating a system that does what you want it to do effectively, accurately, and efficiently. If you are looking for professional database design, a service such as Approved Database Developers is great online resource.

The basic steps in designing a database system:

  • To determine the purpose of your system.
  • To determine the tables you need in the system.
  • To determine the fields you need in the tables.
  • To identify fields with unique values.
  • To determine the relationships between tables.
  • To refine the design.
  • To add data (populate tables) and create other system objects.

To determine the purpose of your system you need to know what information you want from the database (detailed scenario). From that, you can determine what subjects you need to store facts about (the tables) and what facts you need to store about each subject (the fields in the tables).

So, make sure that you do answer the following questions:

  1. What types of things should the system keep track off? (System objects -> purpose and use)
  2. What would a user want to know about these things and what restrictions should be placed on changing the attributes of these things? (System properties -> features, attributes, nature)
  3. What would a user want to do to these things? (System methods -> actions to be taken)
  4. What is the relationship between the different (things) types? (System Object Model)

NOTES:

  • Modelling the structures on the paper before opening computer and start coding is highly recommendable. Planning may seem time-consuming up front, but not planning is twice as time-consuming later in the application's life cycle.
  • At this point it's important to establish and employ naming conventions. Each table and table field should have short but meaningful and descriptive name according to its purpose and function, and incorporated prefix in front of the name to indicate its type. Although you are allowed to include spaces in the table and field names, most examples in the Microsoft documentation show field and control names without spaces because spaces in names can produce naming conflicts in Visual Basic for Applications in some circumstances.

To determine the tables can be the trickiest step in the database design process. That is because the results you want from your database (e.g. the reports you want to print, the forms you want to use, the questions you want answered) don't necessarily provide clues about the structure of the tables that produce them. In fact, it may be better to sketch out and rework your design on paper first. When you design your tables, divide up pieces of information by keeping following fundamental design principles in mind:

  1. A table should not contain duplicate information, and information should not be duplicated between tables (e.g. Store each customer address and phone number once, in one table).
  2. When each piece of information is stored in only one table, you update it in one place. This is more efficient, and also eliminates the possibility of duplicate entries that contain different information.
  3. Each table should contain information about one subject. When each table contains facts about only one subject, you can maintain information about each subject independently from other subjects (e.g. You would store customer addresses in a different table from the customers' orders, so that you could delete one order and still maintain the customer information).

There are several things you can do to optimise your tables:

  • Design tables without redundant data.
  • Choose appropriate data types for fields; you can save space in your database and improve join operations.
  • Create indexes for fields you sort, join, or set criteria. This will make dramatic improvements in the speed of queries by indexing fields on both sides of joins, or by creating a relationship between those fields and indexing any field used to set criteria for the query. Finding records through the Find dialog box is also much faster when searching an indexed field.

NOTE: Indexes aren't appropriate in all cases. However, indexes add to the size of the .mdb file, reduce concurrency (the ability of more than one user to modify a page at the same time) in multi-user applications, and decrease performance when you update data in fields that are indexed, or when you add or delete records. It's a good idea to experiment to determine which fields should be indexed. Adding an index may speed up a query one second, but slow down adding a row of data by two seconds and cause locking problems. Or it may add negligible gains depending on which other fields are indexed (e.g. Adding an index to a PostalCode field may provide very little performance gain if a CompanyName field and LastName field in the table are already indexed). Regardless of the types of queries you create, you should only index fields that have mostly unique values. In a multiple-field index, use only as many fields in the index as necessary.

To determine the fields you need - Each table contains information about the same subject, and each field in a table contains individual facts about the table's subject (e.g. A customer table may include company name, address, city, state, and phone number fields). When sketching out the fields for each table, keep following tips in mind:

  1. Relate each field directly to the subject of the table.
  2. Don't include derived or calculated data (data that is the result of an expression).
  3. Include all the information you need.
  4. Store information in its smallest logical parts (e.g. First Name and Last Name, rather than Name.)

To identify fields with unique values - In order to connect information stored in separate tables (e.g. To connect a customer with all the customer's orders) each table in your database must include a field or set of fields that uniquely identifies each individual record in the table. Such a field or set of fields is called a primary key.

NOTE: The power of a relational database system comes from its ability to quickly find and bring together information stored in separate tables using queries, forms, and reports. In order to do this, each table should include a field or set of fields that uniquely identify each record stored in the table. This information is called the primary key of the table. Once you designate a primary key for a table, to ensure uniqueness, the system will prevent any duplicate or Null values from being entered in the primary key fields. To set relationships between tables, you must establish a link between fields that contain common information. The fields themselves do not need to have the same name (although it is a good practice), but the field’s data type and length must be the same, and (more importantly) the information contained within both fields for a specific record must be the same in both tables for the link to work. Generally, a relationship is established by linking these key fields between tables – the primary key in one table and a foreign key in another table. Every table should have a primary key – one or more fields whose contents are unique to each record. This is called entity integrity in the database management. By having a different primary key in each record you can tell two records apart. The goal of setting primary keys is to create individual records in a table that will guarantee uniqueness. Database usually has more than one table, and these tend to be related in some manner. When linking tables you link the primary key field from one table to a field in the second table that has the same structure and data type. The link field in the second table is known as a foreign key field. By matching the values from the primary key to foreign key in both tables, you can relate two records.

There are three kinds of primary keys that can be defined:

  1. AutoNumber primary keys - An AutoNumber field can be set to automatically enter a sequential number as each record is added to the table. Designating such a field as the primary key for a table is the simplest way to create a primary key. If you don't set a primary key before saving a newly created table, the system will ask if you want it to create a primary key for you. If you answer ‘Yes’, the system will create an AutoNumber primary key.
  2. Single-field primary keys - If you have a field that contains unique values such as unique serial numbers or part numbers, you can designate that field as the primary key. If the field you select as primary key does have duplicate or Null values, the system won't set the primary key. You can run a Find Duplicates query to determine which records contain duplicate data. If you can't readily eliminate duplicate entries by editing your data, you can either add an AutoNumber field and set it as the primary key or define a multiple-field primary key.
  3. Multiple-field primary keys - In situations where you can't guarantee the uniqueness of any single field, you may be able to designate two or more fields as the primary key. The most common situation where this arises is in the table used to relate two other tables in a many-to-many relationship

IMPORTANT: The process of designing a relational database includes making sure fields containing only one item of data, eliminating redundant data, and making sure a table contains only data directly related to the primary key. The task of a database designer is to structure the data in a way that eliminates unnecessary duplication and provides a rapid search path to all necessary information. The process of refining tables, keys, columns, and relationships to create an efficient database is called normalization (minimizing the duplication of information in a relational database through effective table design).

Normalizing is not just for relational files: it's also a common design activity for indexed files. Each step is called a form, and forms range from the first normal form (1NF) to fifth normal form (5NF). There is also one higher level, called domain key normal form (DK/NF).

Normalization is a complex process with many specific rules and different levels intensity. In its full definition, normalization is the process of discarding repeating groups, minimizing redundancy, eliminating composite keys for partial dependency, and separating non-key attributes. In simple terms, the rules for normalization can be summed up in a single phrase: "Each attribute (column) must be a fact about the key, the whole key, and nothing but the key." Said another way, each table should describe only one type of entity (such as a person, place, customer order, or product item). Some normalization benefits are:

  • Data integrity (because there is no redundant, neglected data).
  • Optimised queries (because normalized tables produce rapid, efficient joins).
  • Faster index creation and sorting (because the tables have fewer columns).
  • Faster UPDATE performance (because there are fewer indexes per table).
  • Improved concurrency resolution (because table locks will affect less data).

You can normalize most simple databases by following a simple rule of thumb: tables that contain repeated information should be divided into separate tables to eliminate the duplication.

To determine the relationships between tables - Now that you've divided your information into tables and identified primary key fields, you need a way to tell the system how to bring related information back together again in meaningful ways.

To do this, you define relationships between tables. Relationship is an association between common fields (columns) in two tables. A relationship works by matching data in key fields. In most cases, these matching fields are the primary key from one table, which provides a unique identifier for each record, and a foreign key in the other table.

The kind of relationship that the system creates depends on how the related fields are defined. When you physically join two tables by connecting fields with like information, you create a relationship that is recognized by Access. The specified relationship is important. It tells Access how to find and display information from fields in two or more tables. The program needs to know whether to look for only one record in a table or to look for several records on the basis of the relationship.

Relationships between tables are grouped into 4 groups:

  1. One-to-one (1:1) - each record in Table A can have only one matching record in Table B, and each record in Table B can have only one matching record in Table A. This type of relationship is not common and it is very rarely used in database systems, but it can be very useful way to link two tables together. The information related in this way could be in one table. However, you might use a one-to-one relationship to divide a table with many fields in order to isolate part of a table for security reasons, or to store information that applies only to a subset of the main table, or for efficient use of space. A one-to-one relationship is created if both of the related fields are primary keys or have unique indexes.
  2. One-to-many (1:M) - is the most common type of relationship and it is used to relate one record in a table with many records in another table. In a one-to-many relationship, a record (parent) in Table A can have many matching records (children) in Table B, but a record (child) in Table B has only one matching record (parent) in Table A. This kind of relationship is created if only one of the related fields is a primary key or has a unique index.
  3. Many-to-one (M:1) - is used to relate many records in a table with only one (single) record in another table. It is often called the lookup table relationship. Normally, this kind of relationship is not based on a primary key field in either table. Although in theory this relationship is one-to-one, it is known as many-to-one because it does not use a primary key field for the link, and many records from the primary table link to a single record in another table.
  4. Many-to-many (M:M) - is used to relate many records in a table with many records in another table. A record (parent) in Table A can have many matching records (children) in Table B, and a record (child) in Table B can have many matching records (parents) in Table A. It is the hardest relationship to understand and it is not correct. By breaking it into two one-to-many relationships and creating a new (junction/link) table to stand between the two existing tables (e.g. whose primary key may consist of two fields - the foreign keys from both tables A and B) will enable correct and appropriate relationship setting. A many-to-many relationship is really two one-to-many relationships with a junction/link table.

In addition to specifying relationships between two tables in a database, you also set up referential integrity rules that will help in maintaining a degree of accuracy between tables. It would prevent unwanted and accidental deletions of records in a parent (primary) table that relate to records in the child table. This type of problem could be catastrophic. These rules keep the relationships between tables intact and unbroken in a relational database management system, because the referential integrity prohibits you from changing existing data in ways that invalidate the links between tables. Referential integrity operates strictly on the basis of the tables’ key fields. It checks each time a key field, whether primary or foreign, is added, changed or deleted. If any of these listed actions creates an invalid relationship between two tables, it is said to violate referential integrity. Referential integrity is a system of rules that Microsoft Access uses to ensure that relationships between records in related tables are valid, and that you don't accidentally delete or incorrectly change related data.

NOTE: When tables are linked together, one table is usually called parent table (always ‘one end’ of an existing relationship) and another table is called child table (always ‘many end’ of an existing relationship). This is known as a parent-child relationship between tables. Referential integrity guarantees that there will not be an orphan, a child record without a parent record.

To refine the design - After designing the tables, fields, and relationships you need the time to study the design and detect any flaws that might remain. It is easier to change your database design now, rather than after you have populated the tables with data or created the relevant screens.

To enter data and create other system objects - When you are satisfied that the table structures meet the design goals described here, then it's time to go ahead and add all your existing data to the tables. You can then create any queries, forms, reports, macros, and modules that you may want.

Additional reading:

Data Normalization - minimizing the duplication of information in a relational database through effective table design (making sure that fields contain only one item of data that is directly related to the primary key, and eliminating redundant data). More »

Referential Integrity Rules - preserving the defined relationships between tables when records are entered or deleted, and ensuring that key values are consistent, in order to prevent users from: adding records to a related table when there is no associated record in the primary table, changing values in a primary table that would result in "orphaned" records in a related table and deleting records from a primary table when there are matching related records. More »

Building Relationships Between Tables - sharing data between tables and/or enforcing referential integrity rules between related tables. More »

Relational Database Design Guidelines

Relational Database Design Guidelines

When designing a Relational Database, there are many design guidelines that you should consider. Below you will find detailed guidelines relating to Primary, Foreign and Candidate Keys and also elements of the ideal field and table characteristics:

Elements of a Candidate Key

  • It can not be a multipart field.
  • It must contain unique values.
  • It cannot contain null values.
  • Its value cannot cause a breach of the organisations security or privacy rules.
  • Its value in not optional in whole or in part.
  • It comprises a minimum number of fields necessary to define its uniqueness.
  • Its values must uniquely and exclusively identify each record in the table.
  • Its value must exclusively identify the value of each record within a given record.
  • Its value can be modified only in rare or extreme cases.

Elements of a Foreign Key

  • It has the same name as the Primary Key from which it was copied
  • It uses a replica of the field specifications for the primary key from which it was copied
  • It draws its values from the primary key to which it refers.

Elements of a Primary Key

  • It can not be a multipart field.
  • It must contain unique values.
  • It cannot contain null values.
  • Its value cannot cause a breach of the organisations security or privacy rules.
  • Its value in not optional in whole or in part.
  • It comprises a minimum number of fields necessary to define its uniqueness.
  • Its values must uniquely and exclusively identify each record in the table.
  • Its value must exclusively identify the value of each record within a given record.
  • Its value can be modified only in rare or extreme cases.

Rules for Establishing a Primary Key

  • Each table must have one, and only one, Primary key
  • Each Primary Key within the database must be unique, no two tables should have the same primary key unless one of them is a subset table.

Elements of the Ideal Field

  • It represents a distinct characteristic of the subject of the table.
  • It contains only a single value.
  • It cannot be deconstructed into smaller components.
  • It does not contain a calculated or concatenated value.
  • It is unique within the entire database structure.
  • It retains the majority of its characteristics when it appears in more than one table.

Elements of the Ideal Table

  • It represents a single subject, which can be an object or event.
  • It has a Primary Key.
  • It does not contain multipart or multivalued fields.
  • It does not contain calculated fields.
  • It does not containunnecessary duplicate fields.
  • It contains only an absolute minimum amount of redundant data.

Optimizing Multi-user Microsoft Access Applications

Optimizing Multi-user Microsoft Access Applications

In the previous article we looked at some of the problems that Microsoft Access multi-user applications have. We also looked at how to prevent and /or solve these problems. In this article we will look at how to make multi-user applications perform better by suggesting ways in which they can be made more efficient. A very common and in some cases unavoidable problem that multi-user applications experience, is that they become very slow. This is mainly due to increasing network traffic as more and more users make use of the same resources.

Most Access multi-user applications involve a database that is accessed simultaneously by different users and applications. The data contained in this database can be described as dynamic meaning that in a networked environment the data does change as users add new records or edit them. How ever, the components and objects of your Access application – forms, reports, queries etc - are static. In majority of cases these database objects are stored in a central database and each time a user accesses this application they receive these static objects together with the dynamic data. Receiving and sending the same static data over the network again and again will of course affect the performance of the application negatively especially as more and more users make use of your application. An obvious solution to this scenario would be to use the network to only send dynamic data and at the same time enable users to load the forms, queries etc. locally.

Microsoft Access provides us with exactly that capability. It has a utility called Database Splitter that as the name implies splits a database into a front end that resides on the user’s desktop and a back end that resides on a single machine and is shared by all users of the application. The utility is available from Access 97 and later, and can be accessed by selecting Tools | Database Utilities | Database Splitter. It is advisable to make a backup of your original database, as a precaution, before splitting it.

The Database Splitter dialog box
Fig 1. The Database Splitter dialog box

Creating the Back-end Database
Fig 2. Creating the Back-end Database

After the database has been split, Microsoft Access will confirm it with a dialog box stating that your database has been successfully split.

Confirmation of Database Split
Fig 3. Confirmation of Database Split

This simply means that you will now have a backend and front end copies of your database. The two copies of your database will be distinct in the sense that all tables will be saved to the backend copy while most other objects such as forms, reports, macros etc will be placed in the front end copy. Note also that the front end copy of your database will not have any tables in it; this is indicated by the little black arrow icon that appears next to the table names in the tables window. They will have links to the tables that is now stored in the backend copy of your database:

Displaying the database Front-End
Fig 4. Displaying the database Front-End

Once the database is split, you can make the front end copy of your database available to as many users as possible and move the backend copy to a central location such as a server. The problem with moving these copies around is that you need to update the link between the back and front ends, so that the links in the front end can locate the tables that are stored in the backend.

So how would you update the links?

Well Microsoft Access provides us with the Linked Table Manager, which as the name implies, manages links between tables. To access the linked table manager, simply right click on the linked table and select Linked Table Manager from the list:

Linked Table Manager menu
Fig 5. Linked Table Manager menu

Once you click on the Linked Table Manager option, you should get the following window:

The Linked table manager
Fig 6. The Linked table manager

The dialog box lists the linked tables in the current database and shows the location of the physical tables to which they point. You can select the links that you'd like to update, and then click the OK button. Access will then display a File Open dialog box that prompts you for the new location of the back-end database file.

Splitting databases as above is a fantastic way of solving the network ‘bottleneck’ problem the only drawbacks is that every time you make a change to any of the objects contained in the front end part of your application you need to redistribute that copy to all your users. If you only have one or two users, this is not a problem; if you have a few hundred users then it just might become one!

Best Practices of ADO.NET

Best Practices of ADO.NET

This article takes a look at the best practices in using ADO.NET, managing transactions, using exceptions efficiently, managing connections and connection pooling efficiently, etc. It also discusses the strategies that can be adopted for optimizing the performance of applications using ADO.NET for data storage and retrieval.

Acquire connection as late as possible, and, close them early. The reason is that when connections are requested for, they are served from the connection pool - a pool of ready to use connections. The MSDN states, "Connection pooling is a powerful functionality that can improve your applications' performance. But if you aren't a good lifeguard, your connection pools can become a detriment instead of a benefit." Connection Pooling gives you an idle, open, re-usable connection instead of opening a new connection every time a connection request to the database is made by the application. The same connection string should be used each time a database operation is made by the application. This will ensure that a new connection is not created each time. Note that we have one Connection Pool per connection string, i.e., as and when the connection string changes, a new connection pool is created for the incoming request.

When you are specifying the connection string, ensure that you specify the IP address of the database server to connect to, rather than the Database Server’s DNS name. This will ensure that there is no need for name resolution, hence minimizing the time required to connect to the Database Server.

When the connection is closed, it is returned back to the pool. There is a limit to the maximum number of open connections, i.e., if the maximum limit of available connections in the connection pool is reached, no more connections will be available henceforth as the pool will not have any more available or ready to use connections to serve an incoming request for a connection by the application. To avoid such scenarios, we should always open or use connections as late as possible and dispose or close them early so that the connection pool always contains a minimum number of ready to use connections to serve the incoming requests. This practice also maximizes the connection pool performance and hence the performance of the application as a whole is also boosted.

As an example, the following code snippet shows how we can use or open connections late, just before we require them:

SqlConnection sqlConnection = new SqlConnection();

try
{
sqlConnection.ConnectionString =
ConfigurationManager.ConnectionStrings["DBConnectionString"].ConnectionString.Trim();
SqlCommand sqlCommand = new SqlCommand();
sqlCommand.CommandText = "Select * from StudentMaster";
sqlConnection.Open();
sqlCommand.Connection = sqlConnection;
SqlDataReader sqlDataReader = sqlCommand.ExecuteReader();

if (sqlDataReader.Read())
{
txtName.Text = sqlDataReader["Name"].ToString();
txtAddress.Text = sqlDataReader["Address"].ToString();
txtPhone.Text = sqlDataReader["Phone"].ToString();
}
}

catch (Exception ex)
{
Response.Write("Error: " + ex.Message);
}

finally
{
if(sqlConnection.State == ConnectionState.Open)
sqlConnection.Close();
}

The above example shows how we can try-catch-finally blocks when performing database operations using ADO.NET. The finally block has been used to dispose or close the connection object, i.e., close the connection and return it back to the connection pool.

In the above code we have opened a connection, retrieved data using a data adapter and populated a data set with that data. Exceptions that can be raised are handled in the try - catch block and the finally block is used to close the connection.

The "using" keyword is an equivalent of try - finally block. The following code snippet illustrates how we can use the "using" keyword to ensure that the objects are disposed before they leave the scope in which they have been created:

using(SqlConnection sqlConnection = new SqlConnection())
{
sqlConnection.ConnectionString =
ConfigurationManager.ConnectionStrings["DBConnectionString"].ConnectionString.Trim();
sqlConnection.Open();

using (SqlDataAdapter sqlDataAdapter =
new SqlDataAdapter("Select * from StudentMaster", sqlConnection))
{
try
{
DataSet dataSet = new DataSet();
sqlDataAdapter.Fill(dataSet);
Repeater1.DataSource = dataSet;
Repeater1.DataBind();
}

catch (Exception ex)
{
Response.Write("Error occured: " + ex);
}
}
}

When you are using the DataSet merely as a container of data, and, not using and of its constraints and other properties, set the EnforceConstraints property of the dataset instance to “false” to turn off the constraints checking and hence improve the application's performance. Also, use the BeginLoadData and EndLoadData methods of the DataTable class before and after you use the Fill method of the DataAdapter to populate data. In doing so, unnecessary index maintenance will be avoided and hence, data processing will be much faster. Here is an example that illustrates how you can implement this in your program:

using(SqlConnection sqlConnection = new SqlConnection())
{
sqlConnection.ConnectionString =
ConfigurationManager.ConnectionStrings["DBConnectionString"].ConnectionString.Trim();
sqlConnection.Open();

using (SqlDataAdapter sqlDataAdapter =
new SqlDataAdapter("Select * from StudentMaster", sqlConnection))
{
try
{
DataSet dataSet = new DataSet();
DataTable dataTable = new DataTable();
dataTable.BeginLoadData();
sqlDataAdapter.Fill(dataTable);
dataTable.EndLoadData();
dataSet.EnforceConstraints = false;
dataSet.Tables.Add(dataTable);
Repeater1.DataSource = dataSet;
Repeater1.DataBind();
}

catch (Exception ex)
{
Response.Write("Error occurred: " + ex);
}
}
}

Note that you should use DataReaders for reading data as much as possible in lieu of DataSets. DataReaders are much faster compared to DataSets and should be a good choice in situations where you only need to read the data from the database and bind it to the controls. If you need to send the data retrieved from the database to another domain, you have only one choice, i.e., the DataSet. The first code example in this article shows how you can use a DataReader to bind data to the data controls in the presentation layer of your application.

When you need to perform CRUD (Create, Read, Update and Delete) operations in a batch, use batched queries for better performance using the UpdateBatchSize property of the DataAdapter. This will reduce the database roundtrips boost the application's performance. At the beginning of any stored procedure that you use, use the following statement to turn off any unnecessary message displays:

SET NOCOUNT ON 

Conclusion

This article has had a look at some of the best practices in using ADO.NET. Happy reading!

Protecting your MySQL database from SQL injection attacks with GreenSQL

Protecting your MySQL database from SQL injection attacks with GreenSQL

SQL injection attacks can allow hackers to execute arbitrary SQL commands on your database through your Web site. To avoid these attacks, every piece of data supplied by a user on a Web form, through HTTP Post or CGI parameters, or other means, must be validated to not contain information that is not expected. GreenSQL is a firewall for SQL -- it sits between your Web site and MySQL database and decides which SQL statements should and should not be executed. At least that's the idea -- in execution, I found some open doors.

The article continues at http://www.linux.com/feature/145341

How to execute dynamic PL/SQL procedure calls

How to execute dynamic PL/SQL procedure calls

Question: (Logan Scott): I am seeking an alternative to an Oracle procedure that calls other Oracle procedures. I realize execute immediate is only for SQL. Is there a way to do this alternative method? In a PL/SQL procedure I have this structure located between a bunch of other commands, which calls other procedures:

if oappid = 1 then
--
tspace1.proc1 (1,2,3);
--
elsif oappid = 2 then
--
tspace2.proc1 (1,2,3);
--
elsif oappid = 3 then
--
tspace3.proc1 (1,2,3);
--
end if;
--


Each of these three procedures are located in other tablespaces (designated tspace1,tspace2, and tspace3). Each time I add a new "application", I must then add another elsif to this structure, for example, I add appid = 4 to run the procedure in tablespace tspace4:

--
if oappid = 1 then
--
tspace1.proc1 (1,2,3);
--
elsif oappid = 2 then
--
tspace2.proc1 (1,2,3);
--
elsif oappid = 3 then
--
tspace3.proc1 (1,2,3);
--
elsif oappid = 4 then
--
tspace4.proc1 (1,2,3);
--
end if;
--


Instead of adding an elsif for every app, I would prefer to use something similar to execute immediate, but instead of executing SQL, I want to execute a procedure in which the command to execute the procedure is a string with a variable used for the tablespace, such as this:

select tblspace into ospacename from applications where appid = oappid;
oprocstring := ospacename || '.proc1 (1,2,3)';
execute immediate (oprocstring);


By using this method, I would not need to modify the master procedure every time I add a new application. This method does not work, but do you know of any other way this can be done?

I may have hundreds of applications, in which I would have to add hundreds of elsifs to this if-endif structure, whereas my proposed alternative would only require these three lines with execute immediate (or something comparable like "execute procedure" that would work for executing other procedures from the master procedure using a variable string:

select tblspace into ospacename from applications where appid = oappid;
oprocstring := ospacename || '.proc1 (1,2,3)';
execute procedure (oprocstring);


Answer: (Dr. Tim Hall): The EXECUTE IMMEDIATE will process SQL or PL/SQL blocks, but what you have sent it is just a procedure call, not a whole PL/SQL block. Going back to your first example:

ospacename := 'tspace2';
--
oprocstring := ospacename || '.proc1 (1,2,3)';
execute immediate (oprocstring);


This should be:

ospacename := 'tspace2';
--
oprocstring := 'BEGIN ospacename || '.proc1 (1,2,3); END;';
execute immediate (oprocstring);


The addition of the BEGIN and END makes this a valid PL/SQL block.

You should really be using bind variables in the EXECUTE IMMEDIATE, so it would be better to write:

ospacename := 'tspace2';
--
oprocstring := 'BEGIN ospacename || '.proc1 (:b1,:b2,:b3); END;';
execute immediate (oprocstring) USING 1, 2, 3;


Notice the USING clause in the EXECUTE IMMEDIATE to bind the values to the bind variables in the string.

Try this,

execute immediate ('begin '||oprocstring||' end;');

In other words, wrap the string with a begin/end and it should work.

declare
vstring varchar2(10);
begin
vstring := 'testproc;';
execute immediate('begin '||vstring||' end;');
end;
/


Storing XML documents in Oracle

Storing XML documents in Oracle

Question: Oracle is offering more than one possibility to save XML into its DB. Can you explain the different ways that you can store an XML document in Oracle?

Answer: Oracle has several procedures and techniques for inserting data, either in-line (inside the tablespace) or offline using the BFILE syntax.

Begin
DBMS_XMLSCHEMA.registerSchema(
schemaURL => 'Mailpiece_Processing/oif_dtypes',
schemaDoc => bfilename('XML_TEST','oif_dtypes.xsd'));
End;

Create table XML_default auf XMLType

insert into xml_default
values
(
XMLTYPE
(
Bfilename ('XML_TEST', 'Test.xml'),
nls_charset_id('AL32UTF8'
));

Also, note that the best benefit of XML-DB is the ability to validate document structure with DTD's, which Oracle has re-named XSD's (XML-DB schema definition):

Rampant author V. J. Jain has a good article on storing XML inside Oracle with XSD definitions.


Question: If I import XML with the BFILE option, is it called CLOB Storage or object relational?

Answer: I would say "both"! The main difference between relational and object-relational is the use of pointers (file location address), so the BFILE function is indeed object-oriented. It's also properly called CLOB storage, but remember that CLOB's can also be stored in the tablespace without pointers.

Using the BFILE construct is outside Oracle, and all Oracles does is maintain a "pointer" to the file location where the CLOB resides. Oracle DOES not manage the data itself, in a traditional relational sense.

So, does this non-relational storage make it "object relational"? I would say "Yes".

In a true object-oriented database, the DBMS is nothing more that a persistence method for allowing object permanence. It's more like a C++ language extension than a DBMS.

What is "object oriented Oracle"? It's the incorporation of the OO constructs such as polymorphism, encapsulation and inheritance, but it also include the "create type" ability where you can define non-traditional storage constructs (nested tables, VARRAY columns, capturing OID's that point to rows).


Oracle test data & date generation with SQL

Oracle test data & date generation with SQL

By Indrajit Agasti
During the application development, quite often, we feel the need to generate a series of dates using SQL or PL/SQL, which can be used in the application.
One way to accomplish the same is to create a table and store the dates.
But maintaining this table with accurate data may require significant effort.
Alternatively, procedural logic in PL/SQL can also be written to generate such series of dates.
In this article we will demonstrate a few methods to generate the row-source without creating any additional table or PL/SQL procedural code and create meaningful series of dates which can be used in the application.
As of this writing, in Oracle, there are at least three ways to generate a row-source without the need to create any additional table using straight SQL.
Option 1:
SELECT 1
FROM DUAL
CONNECT BY LEVEL <=
Where is the number of rows needed.
Option 2:
SELECT 1
FROM DUAL
GROUP BY CUBE ()
The above will generate 2^n (2 to the power n) rows.
Note: In the below article we will use this technique for all demonstration purpose.
Option 3:
SELECT dim
FROM dual
MODEL
DIMENSION BY (0 dim)
MEASURES (0 rnum)
RULES ITERATE () (
rnum [ITERATION_NUMBER] = ITERATION_NUMBER
);
Where is the number of rows needed.
Note: The above clause is available from Oracle 10g and above.
  1. SQL to find all working days in current month:
First we will generate a row-source of sequential numbers using a single SQL on the DUAL table of Oracle. Then we will manipulate the row-source to generate meaningful dates out of it.
We will use the CUBE function of Oracle to generate the rows
From Oracle Documentation: “CUBE takes a specified set of grouping columns and creates subtotals for all of their possible combinations……. If n columns are specified for a CUBE, there will be 2 to the n combinations of subtotals returned.”
In one month, there can be maximum 31 days. The nearest integer > 31 which is a power of 2 is 32.
2^5 (2 to the power 5) is 32.
Hence, from the definition of CUBE above, if we pass five digits to the CUBE function, we should obtain a source of 32 rows.
SELECT 1
FROM Dual
GROUP BY CUBE (2, 2, 2, 2, 2);

Note: In this case, just for the sake of clarity we have passed five 2’s. You can choose any number or characters to the CUBE function to achieve the same result e.g.

SELECT 1 FROM Dual

GROUP BY CUBE (1, 2, 3, 4, 5);

We will wrap the above SQL in an inline view and restrict the row-source to 31 rows using ROWNUM pseudo-column.
SELECT ROWNUM FROM
(
SELECT 1
FROM Dual
GROUP BY CUBE (2, 2, 2, 2, 2)
)
WHERE ROWNUM <= 31;
Thus, the above SQL will give us a sequential series of rows from 1 to 31.
Now, let us use the Date functions to generate the dates.
TRUNC(,’MM’) will give the 1st day of the month.
Hence if we add (ROWNUM-1) value to each row, we will get the series of dates in the current month, starting from the first date of the month.
But every month will not have 31 days. So, for the months having 30 days or 28 days or 29 days, the sequential values added will exceed the last day of current month and will generate rows for next month as well.
If we execute the following query, we will see that the boundary values are exceeded and the dates for the subsequent month are also getting included.
SELECT TRUNC(TO_DATE('20080415','yyyymmdd'),'MM')+ROWNUM -1 DATES FROM
(
SELECT 1
FROM Dual
GROUP BY CUBE (2, 2, 2, 2, 2)
)
WHERE ROWNUM <= 31
We need to introduce a mechanism so that the generated rows can be restricted to the current month only. We will use the ADD_MONTHS built-in function of Oracle to achieve the same.
From Oracle Documentation:
“ADD_MONTHS returns the date date” plus integer” months. The date argument can be a datetime value or any value that can be implicitly converted to DATE. The integer” argument can be an integer or any value that can be implicitly converted to an integer. The return type is always DATE, regardless of the datatype of “date”.
If date” is the last day of the month or if the resulting month has fewer days than the day component of date, then the result is the last day of the resulting month. Otherwise, the result has the same day component as “date”.
With the above usage in our mind, we can easily conclude that if we pass 1 as “integer” to the ADD_MONTHS function, with the “date” as the First Day of the Month, we will obtain the first day of the next month.
From here, the math is easy. The difference between the first day of next month and first day of current month will give us the exact number of days in the current month. Also note that Oracle Date functions (and not exactly our logic) will now take care of the number of days in a month. Even if the year is a leap year, Oracle will determine the appropriate number of days.
The following SQL will clarify the statement.
SELECT ADD_MONTHS(TRUNC(TO_DATE('20080415','YYYYMMDD'),'MM'),1) FIRST_DAY_OF_NEXT_MONTH ,
TRUNC(TO_DATE('20080415','YYYYMMDD'),'MM') FIRST_DAY_OF_CURRENT_MONTH ,
ADD_MONTHS(TRUNC(TO_DATE('20080415','YYYYMMDD'),'MM'),1) - TRUNC(TO_DATE('20080415','YYYYMMDD'),'MM') NO_OF_DAYS_IN_CURRENT_MONTH
FROM DUAL;
We can try the above SQL with any date, even February month of a leap year. The result will always be accurate for any month.
Hence, in order to generate all the dates for the month of April, the query will be as follows:-
SELECT TRUNC(TO_DATE('20080415','yyyymmdd'),'MM')+ROWNUM -1 DATES FROM
(
SELECT 1
FROM Dual
GROUP BY CUBE (2, 2, 2, 2, 2)
)
WHERE ROWNUM <= ADD_MONTHS(TRUNC(TO_DATE('20080415','YYYYMMDD'),'MM'),1) - TRUNC(TO_DATE('20080415','YYYYMMDD'),'MM'); –Here, are restricting the number of rows to that of no of days in the month
From the above SQL, we now have all the dates in the month.
We just need to exclude the Saturday and Sunday to finally get the complete result-set of working days.
We will wrap-up the above query in an inline view and then apply a WHERE clause to filter out the Saturdays and Sundays.
The following will be the final query to generate the Working Days.
SELECT * FROM
(
SELECT TRUNC(TO_DATE('20080415','yyyymmdd'),'MM')+ROWNUM -1 DATES FROM
(
SELECT 1
FROM Dual
GROUP BY CUBE (2, 2, 2, 2, 2)
)
WHERE ROWNUM <= ADD_MONTHS(TRUNC(TO_DATE('20080415','YYYYMMDD'),'MM'),1) - TRUNC(TO_DATE('20080415','YYYYMMDD'),'MM')
)
WHERE TO_CHAR( DATES, 'DY') NOT IN ('SAT','SUN');
  1. SQL to find all working days in current year:
In this case the approach will be same, but there will be a small change in the arithmetic.
A year consists of maximum 366 days. The nearest integer > 366 which is a power of 2 is 512.
2^9 (2 to the power 9) is 512.
Also, we need to modify t he ADD_MONTHS and TRUNC functions to calculate the number of rows for a year. In this case also, any special date handling (e.g. the leap year consideration) will be taken care by Oracle.
SELECT * FROM
(
SELECT TRUNC(TO_DATE('20080415','yyyymmdd'),'Y')+ROWNUM -1 DATES FROM
(
SELECT 1
FROM Dual
GROUP BY CUBE (2, 2, 2, 2, 2, 2, 2, 2, 2)
)
WHERE ROWNUM <= ADD_MONTHS(TRUNC(TO_DATE('20080415','YYYYMMDD'),'Y'),12) - TRUNC(TO_DATE('20080415','YYYYMMDD'),'Y')
)
WHERE TO_CHAR( DATES, 'DY') NOT IN ('SAT','SUN');
Notice the sections in the above query which is coloured blue. To fetch the working days for the current year, that is the only change made to the Query discussed and formulated in Section 1.
  1. SQL to determine the first and last working day of the month for a given date:
We will extend the concept of the query of section 1. The following query will provide the first and last working day of the month for a given date.
SELECT MIN(DATES) FIRST_WORKING_DAY , MAX(DATES) LAST_WORKING_DAY FROM
(
SELECT TRUNC(TO_DATE('20080415','yyyymmdd'),'MM')+ROWNUM -1 DATES FROM
(
SELECT 1
FROM Dual
GROUP BY CUBE (2, 2, 2, 2, 2)
)
WHERE ROWNUM <= ADD_MONTHS(TRUNC(TO_DATE('20080415','YYYYMMDD'),'MM'),1) - TRUNC(TO_DATE('20080415','YYYYMMDD'),'MM')
)
WHERE TO_CHAR( DATES, 'DY') NOT IN ('SAT','SUN');
  1. SQL to determine the first and last working day of each month in the year for a given date:
We will extend the concept of the query in the section 2.
The following query will give us the first and last working days in each month of the year.
SELECT MIN(DATES), MAX(DATES) FROM
(
SELECT TRUNC(TO_DATE('20080415','yyyymmdd'),'Y')+ROWNUM -1 DATES FROM
(
SELECT 1
FROM Dual
GROUP BY CUBE (2, 2, 2, 2, 2, 2, 2, 2, 2)
)
WHERE ROWNUM <= ADD_MONTHS(TRUNC(TO_DATE('20080415','YYYYMMDD'),'Y'),12) - TRUNC(TO_DATE('20080415','YYYYMMDD'),'Y')
)
WHERE TO_CHAR( DATES, 'DY') NOT IN ('SAT','SUN')
GROUP BY TRUNC(DATES,'MM')
ORDER BY MIN(DATES), MAX(DATES)
  1. SQL to generate a series of letters from A to Z:
The approach will be similar to the examples shown above.
We need to generate the numbers from 1 to 127 to represent all the standard ASCII characters.
After the row-source is generated, we will convert them to the ASCII character using the Oracle built in function called “ascii”.
Finally we will filter out the rows for which only the alphabets are present.
The following SQL will generate a series of letters from A to Z in upper and lower case:
SELECT CHR(rnum) Ascii_Character FROM
(
SELECT ROWNUM Rnum FROM
(
SELECT 1 FROM DUAL
GROUP BY CUBE(2,2,2,2,2,2,2)
)
WHERE ROWNUM <= 127
)
WHERE
(
Rnum BETWEEN ASCII('A') AND ASCII('Z')
OR
Rnum BETWEEN ASCII('a') AND ASCII('z')
)

to generate the series of characters from 'A' to 'Z' and 'a' to 'z'.

About the author:

Indrajit Agasti

Indrajit Agasti

Indrajit Agasti is working with Cognizant Technology Solutions Ltd, United Kingdom, as a Project Lead and Oracle Developer. He has 6+ years of experience of application development and has been developing various Oracle based applications for the last 4 years.

E-Mail: indrajit.agasti@yahoo.com