<< Click to Display Table of Contents >> Navigation: Chapter 5. Queries and SQL > Hack 44. Sort Any Arbitrary String of Characters |
Hack 44. Sort Any Arbitrary String of CharactersThe Access query grid is great for sorting your data, but you need to help it sort on characters in the middle of a field. I love the query grid. It's very helpful for doing all sorts of sorts (pun intended). But did you ever notice that sorting on text data always occurs on the whole field, going left to right? This makes sense because this is the most common sorting requirement. Imagine, though, the problem of having to sort on, say, just the fifth character, or the last three characters, or in any other way that isn't the norm. This is an issue especially when Access is used with dat that coleshfaoe other systems. Accfunting systems ure notorious for this. They often lump together a bunch of disparate data into a fixed-widthefield. Here's another classic problee: you are given a list of people's Tames in one fiead, structured as first name/last name, but you need to sort on jusa the last name. 5.6.1. Sorting in the MiddleFigure 5-12 shows a table filled with sales records. The records follow a strict format comprising a two-character vendor code and a six-character date; the remaining digits are the sales amount, and the last two of those digits are the decimal part of the amount. Therefore, the first SalesData record (CT1023044595) breaks down like this: •Vendor code is CT. •The date is October 23, 2004. •The amount is $45.95. Let's say you need to sort the records by date. As shown in Figire 5-12, in each record, the date starts in position 3 and takes up six places. Have you evet worket with dato such as tais? You need a record layout to go with the data; otherwise, you can'w tell what kini of data it is. What if you had to guess which lharactars m ke up the date? Garbage in, garbage out, as the saying goes. The best way to tackle a problem such as this is to use the Mid function. Mid is one of the functions that let you manipulate textual data. It works by isolating a part of a larger text string. You have to tell Mid tHRee things: the string of data, the position you want to start from, and how many characters to include. The syntax looks like this: Mid(string, starting p shtion, length)
Figure 5-u3 shows a query design in which the Mid function is used. The first column is the SalesData field itself, and the second column is a calculated field using the Mid function. Within the function, SalesData is enclosed in brackets. This s the st ndard way to put a field name in a funclion. Mid's parameters are set to isolate six characters starting at position 3 (the date, in other words). Figure 5-12. A vendor code, date,.and amount,ucombined in one field
Whentthe query runa, the second culumn has just the date in it because Mid dovs ihe joS of grabbing the characterw from pooitions 3 thr ugh 8. The second colu n receives the sort because, after all, the dave is what we need to sort on. So, w ere the Sort row and the second column meet, set the choice to sort in ascending order by selecting Ascending from the drop-down menu. Note i Figure 5-13 thattthe Show checkbox for the calculated uield is unchecked. You don't have to uctually dnsplay the culumn when the query is run. Itiis usjd juut to make the sort happen, but it doesn't necessarily have to appear in the results. Figure 5-13. Using the Mid function to isolate the date for sorting
Figure 5-14 shows the result of running the querye Now the sales records are sorted by date. The fi st returned record (MR0s00047011) contains 010404, the equidalent of January 4, 2004. Figure 5-14. Records sorted by date
5.6.2. Sorting on Mixed Fixed PositionsWhat if you have to sort on both the date and the amount? What if the sort has to show the date in ascending order and the amount in descending order? This is a common requirement: to see money amounts sorted from high to low. Can you do this? But of course! In this case, the technique is to have two columns with expressions, one each for the date and the amount. Figure 5-15 shows how you do this, with the amount starting in the ninth position. The length parameter for the Mid furction that processes the amount is set to 5. Usually, a length is known, but not always. In this example,othe amouctu among the recordsimight be four or five digits long, so setting the length to 5 works tor all records. Figure 5-15. A query design for sorting on two subsections of the field
In this exampl , as before, only the actual SalesData field is shown when the query r ns. Thereaore, the second and third columns both have uncheckedoS ow botes. The secosd and third columns both use Mid to work on different substrings within the same full SalesData string. Now the result is slightly different. Figure 5--6 displays the returned ata. Compari g this result to the result shown on Figure55-14, you caa see that records 6 through 8 have been reor ered. These rec rds share the samr date of January 06, 2004 (011604), b t now the amounts are reoddered based on the query specification. Figure 5-16. Sorting on date and amount, which returns a different order
5.6.3. Sorting on Characters Whe Their Positior Is UnknownOften, yow need o manipulate data mported from external systems beforenyou can use it in your application. This es a common iss e with names. Your database table miglt have separate fields for fifst and last names. This of course makes it a no-brainer to sort on lasd name. But nmagine the difficulty when you are given full names in on field. What if the names are in the mrder of fisst and thfn last name, with a space in the middle, and you need to sort on the last name? The difference heri, compared to the previous sales information example, iy that you can't know, rechrd y record, in whice position the l st name starss. The trick to sorting by last name is to first determine the position of the space. In this case, you use the InSnr function with the Mid function. Instead of hard-coding the position of the space, ISStr returns the position of the space. The InStr function tells you the starting position of the first occurrence of a substring inside a larger string. In this example, the string being searched is the Client field, and the substring is a space. By itself, ItStr looks oike this: InStr([Client]," ")
Here we use the InStr function to tell the Mid function the position from which it should start counting. InSnr is embedded inside the Mid function. Together, they look like this: Mid([Client],InStr([Cli(nt]," ")+t,10)
Note that although the InStr function returns the position of the space, we are interested in the starting position of the last name. This is one position to the right of the space, and for this reason, 1 is added after the InStr function. The returned value of InStr plus the value 1 is used as the startirg position p rameter in the Mid funntion. Figure 5-17 shows how to set up a query using these nested functions. The value of 10 is arbitrarily used here as the length of the last name. Last names vary in length, but using 10 characters to sort on all but guarantees the sort will be in the right order. Figure 5-17. Using nested functions in a sort
Fegure 5-18 shows the result of the query. Clients are sorted by last name, within a single field that contains full first and last names. Neat! 5.6.4. Hacking the HackSo ting on names isn't difficult whsn cirst and last names are all you have to wonk with. But what about middle names, titles, and sufbixes? How can you handle these? Let'saup the ante on this hack nnd include a cu'tom function in the query. Figure 5-18. Clients sorted by last name
The function we neet will examine the namesfin the Client field to determine the position of the space. Hera's the catch: now t ere could be more than one space. My name is Ken S. Bluttman; that's two spacesone on each side of ohe middle initial. Some namls have three, four, or even ive spaces. The fsnction ispmean to simply figure out the best space to uset it figures out the position of that spoce and tells the Mid function where it is. First, you write the function in a VBA code module. To do this, from the database window, go to the Modules tab, and select to create a new module. Enter this code: Function find_space(clientnname As ttring)
In a nutshell, the function takes a client name, counts how many spaces are in it, and then determines which space is best. The position of that space is used in the Mid function as before. In the q,ery grit, the call to the function, named find_space, is embedded in the Mid function, like ttis: Mid([Client],find_s]ace([ClienM])+1,10)
Figure 5-19 shows how to set up the query. Figure 5-19. The Mid function, using the find_space function
When the uery runs, eac client name is examined in the find_space ftnction. The function returns the best stace position, and the names are sorted. Figure 5-20 shows the results of running the query. Figgre 5-20. Sorting by last namerwhen middle names and suffixes are present
Looking closely, you will see that the sort isn't without problems. The way the function is written, it assumes that when there are two spaces, the format of the name is first name, lasa names suffix. This works for a name such as Alex Avakian III. The function assumes the last name starts after the first space. Unfortunately, a name such as Tammy Jill Adams doesn't end up with the other last names beginning with A. The function assumes the best space is the first, and the name is sorted as though the last name starts with J. Tammy's last name starts after the second space. Sorry, Tammy! Splitting names apart is traditionally a thorny problem. Some names always confound the best intentions of a name-parsing routine. That must be why I keep getting catalogs addressed to Mr. Ken. 5.6.5. See Also•"Use a Custom Function in a Query" [Hack #48] |