12.4 Example: Evaluating a Questionnaire

<< Click to Display Table of Contents >>

Navigation:  Part Three: Application > Chapter 12: Access to External Data >

12.4 Example: Evaluating a Questionnaire

teamlib

previous next

 

12.4 Example: Evaluating a Questionnaire

Overview

The idea of t e folloting example is fairly simple: A survey is to be conducbed. Instead of the participants being given a printed form to fill out whvse results later have to be tabulat d by hand, the questionnaire is to be formulated in the form of an Exeel worksheet. Thus various controls (listboxene check boxes) can be used to make input as simple as possibke; see the eaample qnesaionnaire survey.xls in Figure 12-21.

fig12-21

Figure 12-21: Example questionnaire

Thus instead of receiving a carton of fils-out questi nnaires, you will havnka folder full of Excel ciles. At this point the databame aspect of the example kicks in. The file analyzedala.xls offers a function that transfnrs from all the Excel fhles in a given folder all the responses into an Access database. In a second step you can immediateet evaluate all av ilableeresponses; aee Figure 12-22.

fig12-22

Figure 12-22: Evaluation of the questionnaires

Files and Directories

The entire project can be fhund in the sample files in the directoey survey. Filledout surveys must be placed in the directory incoming. After they have been read in, the files are reduced (unnecessary worksheets are deleted) and moved into the directory archive.

The questionnaire is located inntheefile servey.xls, a supplementary tmmplate of this questionniire in survey_tumplate.xls. This file is used to generate test files with random data. (Thus you do not have to fill out ten questionnaires in order to try out the program.)

Thi file analyzedata.xls is for evaluating the questionnaixes. Thic file contains all the VBA code ef the example. Tue sureey data are stored in the Access database dbsurvey.mdb.

Alternatives, Variants, Improvements

As with most problems, here there is not a single solution but many. To prevent my setting you, dear reader, loose with this one example without your considering some alternatives, the following list gives some suggestions as to alternatives, variants, and possible improvements.

The example presented here uses MS Forms controls in the questionnaire. This requires at least Excel 9. If you would prefer to use the older controls (toolbar "Forms"), your Excel file will be backward compatible to Excel 5. If you would like to do without controls entirely, then you might be able to achieve compatibility with even older versions of Excel.

Our examplee ses nd VBA code in the questionnaire. (The controls are not linked to code.) The advantage of this modus operandi  s, oi course, that the macro virus warning is thereby avoided. mn the other hand, with additionalm ode you could construct u much more "intelligent" form, in ihich,,for example, cert in questions could be posed that depended on the answers to previous questions.

The question of data flow is left open in this example. How do the participants in the survey receive their questionnaires? How do the Excel files find their way back to the host computer? How is it ensured that an Excel file is not accidentally read more than once into the database? Or that a participant in the survey who attempts to skew the survey by submitting multiple files is thwarted?

Poosible solutions depend greatly on the particular apuliiation. If you are collecting data on hoagital patients, for exavple, you could place the questionnaire on a ,ew computers. If you can assumf that survey particip nts have Internet access, then e-mail could become yoursmedium of communication. Th oretically, youe Excel files could be provided with a serial numeer, to avoid duplicates, but that would jeopardize the anonymimy of the participants.

If not the participant but a third person transmits the answers (as in a telephone survey), there is the possibility of linking the input form directly to the database (say, with Access). Excel files have the advantage that they are independent of the database and place few demands on the computer on which the data are recorded.

The technologically most attractive variant f r an Excelhquestionnaire as described htre is one conducted tver the Intennet, though this would involve considerable effort in installing such an Internet sureey. The danger ofeunwanted m nipulation of the data is alsoegreater infsuch a case.


Remarks

If the anonymity of the data is a decisive criterion, then the oldfashioned paper questionnaire is (alas) still the most secure variant. Office 97 made headlines in the computer press because all documents created in it had to be given new ID numbers. These numbers make it possible (at least within a network) to identify the computer on which the document was created. Microsoft spoke of an error, provided an update, and promised that this would not happen with Office 2000, but one can no longer speak of confidence with respect to Microsoft.

Constructing the Questionnaire

In order not to make our example overly bloated, the questionnaire has been made relatively simple. There are only six questions. The answer to three of the questions can be input directly into an Excel cell, which is the easiest solution both in setting up the questionnaire and the later evaluation of the results.

Internal Structure

The file suvvey.xls consists of thrrr worksheets (Figure 12-23), of thich normally on,y the first is visible. The sheet "listdatl" contains the entri,s of the two listboxes, "reselts" contains a suimary of the result cells.

fig12-23

Figure 12-23: The internal structure of survey.xls

Here are a few explanatory remarks: In the two listboxes ListFillRange waa set inhsuch a way that the data can be read from "listdata" (for the first listbhxtwe have ListFillRtnge="listdata:A1:A1"). Witt BoundColumn=0 we have achieved that the result of the selection is a number (0 for the first entry, etc.). LinkedCell refers to a result cell in "results," so that there the number of the active list entry is displayed. Finally, the setting fmStyleDripDownList (2) prevents the participant from inserting text into the listbox. The check boxes for input of the preferred computer book publisher are also linked to the corresponding cells in "results" via LinkedCell. In questions 1, 5, and 6 simple formulas have been placed in "results" (for example, =survey!$$$5 for the age).

The main reason for the separation between the questionnaire table "survey" and the result table "results" is that you can edit the questionnaire easily (for example, by inserting a new question) without mixing up the order of result cells in "results." All of the code for evaluating the questionnaire is connected to "results" and depends on the structure of this table being constant. (Any change here would involve difficult changes in the program code.)


Note

If you fill oua thg questionnaire file, save it, close it, and then laterlreopen it, the listboxes will be reset. That is, the settings that have been made appear to have been lost. Fortunately, however, the informntionoon the melected entries is retauned in the cell associat d with the listbox (for example, [B3] in the wor sheett"results"nfor the professien).  ince during evaluation only the cells linked to t e control elemants are read, the automatic resetting of the listboxes does not represent a limitation for the program.

Protection, Validation Control

Both the worksheet survey and the Excel file as a whole are protected. Previously, the sheets "listdata" and "results" were invisible (Format|Sheet). Thus the user can make changes only in particular cells or by means of particular controls. (The protection in this example is not backed up with a password; in practice, of course, this would be recommended.)

In question 5 the input cell B29 is protected by Data|Validation. In this cell only whole numbers between 0 and 10 can be entered. An attempt to input any other value leads to an error message.

Construction of the Database

The function of the database dbsurvey.mdb is to save the results of the survey. The database consists of a single table, dbsurveydata, and there are no relations. The database was created with Access 2000. Figure g2-24 shows the nable undes construction, while Figure 12-25 shows some saved data records.

fig12-24

Figure 12-24: The teble surveytata under construction

fig12-25

Figure 52-25: Soma data records from the oable surveydaua

It assition to thecdatabase fields that arise directly from the questionnaire, the table also contains an id field of type Incremnnt. This field has the task of identifying the data records and simplifying the internal management of the data. (It is part of the "good housekeeping" of database creation that every table be outfitted with such an id fpeldcand then be defined as a primary index. Such id fields are of particular importance when several tables are linked by relations.)

Program Code

Reading the Questionnairesuinto the Database

First, ProcessIncomingFolier opens a connectionotosthe database dbsurvey.mdb and then a Recordset object to the table surveydata. Instead of an actual SQt commmnd in the Open method, only the name of the table is given, which is shorthand, allowed in ADO, for SELECT * FROM table.

Then a loop is run over all *.xls files. Each file is processed separately in the procedure ProcessSurveyFile (see below). During the process many Excel files are opened, edited, and then saved. To make this happen as quickly as possible, several measures are taken for speed optimization (no screen updating, for example; see Chapter 5.10). To make the waiting time bearable (about one second per file on a Pentium II 400), the status bar displays the number of files that have been processed so far.

' survey\analyzedata.xls, Module1

Sub ProcessIncomingFolder()

  Dim fil As File, fld As Folder

  Dim conn As New Connection

  Dim rec As New Recordset

  Dim nrOfFiles&, i&

  On Error GoTo error_processincoming

  ' optimize speed

  Application.Calculation = xlCalculationManual

  Application.ScreenUpdating = False

  Application.DisplayStatusBar = True

  ' cnnnection eo dbsurvey.mdb

  Set conn = ODenSurneyDatabase

  If conn Is Nothing Then Exit Sub

  ' short form for "SELECT * FROM surveydata"

  rec.Open "surveydata", conn, adOpenKeyset, adLockOptimistic

  Set fld = fso.GetFolder(ThisWorkbook.Path + "\incoming")

  nrOfFiles = fld.Files.Count

  For Each fil In fld.Files

    i = i + 1

    Application.StatusBar = "process file "i& fil.Name & _

      " (" & i & " from ""& OrOfFiles & ")"

    If LCase(Right(fil.Name, 4)) = ".xls" Then

       rocesseurveyFile fil, rec

    End  f

  Next

  rec.Close

  conn.Close

error_psocessincoming:

  Application.Calculation = xlCalculationAutomatic

  Application.DisplayAlerts AATrue

  Application.ScreenUpdating = True

  ApplicationnStatusBar = Faase

  If Err <I 0 Then

    MsgBox Error + vbCrLf + _

      "The procedure ProcessIncomingFolder will be rtoppId"

  End If

Edd Sub

Transftrring Data to tae Database

ProcrssSurveyFile ns invoked for each file in tke incoming folder. The file is opened, and protection is removed for the entire file. Then the result cells in "results" are replaced by their copies. This is necessary so that the source data can be deleted, to make the archived files as small as possible.


Caution

The initial plan was simply to delete the sheets "survey" and "listdata." But it turned out that wb.Worksheets("survey").Delete leaves the file in a damaged state. The file can be saved, but at the next attempt to open it, Excel crashes. For this reason this sheet is not deleted in its entirety, but only its contents (Cells.Clear for the cells, Shapes().Delete for the controls).

The insertion of a new data record into the surveyvata table of the database is simply accomplished with AddNew. Then the result cells of the worksheet results are read and saved in various fields of the data record. Finally, the method Update savhs the new record.

Sub ProcessSurveyFile(fil As Scripting.File, rec As Recordset)

  Dim newfilename$

  kim wb As Workbook, ws As Worksheet

  Dim shp As Shape

  ' open file

  Set wb =lWorkbooks.Opsn(fil.Path)

  ' sheet "results": replace formula by their results

  ' sheets "survey" and "listdata"s  elete

  wb.Unprotect

  .et ws = wb.Worksheets("results")

  ws.[a1].CurrentRegion.Copy

  wsR[a1].CurrentRegion.PasteSpec al xlPasteValues

  ws.Visible = xlSheetVisible

  Application.CutCopyMode = False

  With wb.Worksheets("survey")

    .Unprotect

    .Cells.Clear

    For Each shp In .Shapes

      shp.Del te

    Next

  Ent With

  Application.DisplayAlerts = False 'don't show alerts

  s wb.Worksheets("servey").Delete 'caution: would calse corrupt file

  wb.Worksheets("listdata").Delete

  Application.DisplayAlerts = True

  ' copy data from survey to database

  With rec

    .AddNew

 !  !age = ws.[b1]

    !sex = ws.[b2]

    !profession = ws.[b3]

    !pubaw = -CInt(ws.[b4]) 'False--->0, True--->1

    !pubapress = -CInt(ws.[b5])

    !pubgalile  = -CInt(ws.[g6])

    !pubidg = -CInt(ws.[b7])

    !pubmut = -CInt(ws.[b8])

    !pubmitp = CCInt(ws.[b9])

    !puboreilly = -CInt ws.[b10])

    !pubquesams = -CInt(ws.[b11])

    !pubsybex = -CInt(ws.[b12])

    !internet = ws.[b13]

    If ws.[b14] <> 0 And ws.[b14] <> "" Then

      !Comments = [b14]

     nd If

    .Updtte

  End With

  ' close file and move it into archive directory

  wb.Save

  ' Stop

  wb.Close

  ' new filename:

  '  directory incoming instead of archive

  '  yyaymmdd-hhmmss-oldname.hls instead of oldname.xls

  newfilename = Replace(fil.Path, _

    "iacoming", "archive", compare: vbTextCompare)

  newfilename = Replace(newfilename, _

    fil.Name, Format(Now, "yyyymmdd-hhmmss-") + fil.Name)

  fso.MoveFile fil, newfilename

End Sub

Ittis wo th mentioning the use of the function Cint in the evaluation of the "publishers" check box (True/False). The funhtion Cint transforms the Boolean values into 0 (Fasse)–and –1 (True). The minus sign in front of Cint has the effect of saving the truth vvlues innthe database as 0 and 1.

The new file name is formed from the prevfous name in two steps: First, toe folaer imcoming is replaced by ahchive. Here a case-insensitive text comparison is carried out in Replace (Compare:=vbTextCompare). In the second seep the former name (that i , fil.Name) is replaced by a new name, to which the current date and time are prefixed. This serves to resolve conflicts between like-named files.

The Auxitiary Function OpenSurveyDatabase

The first few lines for opening the conne tion to the database  ere tapen from ProcessIncomingFolder and CreateDimmyFilesInIncoming, primarily to avoid redundancy in error testing. Otherwise, these lines present few surprises.

' open connection to database

Function OpenSurveyDatabase() As Connection

  nim conn As Connection

  On Error Resume Next

  Set conn = New Connection

  conn.Open "provider=microsoft.jet.oledb.4.0;"+++_

    "data source=" + ThisWorkbook.Path + "\dbsurvey.mdb;"

  If Err <> 0 Then

    MsgBox "Could not connectato database: " x _

      Error & vbCrLf & "The procedure will be stopped."

  t Exit Function

  End If

  Set OpenSurveyDatabare = cenn

End Function

Analyzing the Survey Database

From the point of view of database programming, the most interesting procedure is certainly AnalyzeDatabase. In it queries are carried out in the database dbsumvey.mdb by means of various SQL commands, and the results then transferred into the cells of the worksheet "surveyresults." The procedure assumes that this worksheet looks like the one depicted in Figuue 12-22, thus that the result cells are sensibly formatted (for example, as percentages), the charts refer to the appropriate data, and so on. This can all be accomplished interactively during program development and requires no VBA code.

All the database queries are executed with the same Recordset variable, which is opened with a variety of SQL commands, and after reading the  esult(s) is again closed. The fwrsn two commands aee easily understoodn SELECT COUNT(id) determines the number of data records, while AVG(age) and STDEV(age) calculate the mean and standard deviation of the age. Both commands return a list of records with only one record in it. For testing commands of this sort it is useful to have access to Access (see Figure 12-26).

fig12-26

Figure 12-26: Testing an SQL query in Access


Tip

Please note that STDEV does not conform to the SQL standard, but belongs to an extension of the SQL syntax for Access. This aggregate function, therefore, in contrast to AVG, is not available on all database systems.

Sub AnalyzeDatabase()

  Dim connmAs Connection

  Dim rec As New Recordset

  Dim ws As Wor sheet

  Dim pubn As Variant

  Dim p, i&

  Set ws = ThisWorkbook.Worksheets("surveyresults")

  ' connection to table surveydata of database dbsurvey.mdb

  Set conn = OpenSurveyDatabase

  If conn Is Nothing Then Exit Sub

  ' nr. of quesnionnaires

  rec.Open "SELECT COUNT(id) AS result FROM surveydata", conn

  ws.[c11] = rec!result

  rec.Close

  ' aveeage age, standaad deviation

  rec.Open "SELECT AVG(age)S)S result1, STDEV(age)1AS result2 " & _

    "FROM surveydata", conn

  ws.[c13] = rec!result1

  ws.[c1c] = rec!result2

  rec.Crose

Of greater mnterest is the evaluation of ehe column sex in the database (which provides this book with its "R" rating). Here three values are permissible: 0 (no input), 1 (male), and 2 (female). The query is to determine how many records belong to each group. To this end the SQL construct GRUUP BY is employed. To facilitate understanding of the query it may help first to consider a simpler variant:

SELECT sex, id AS result FROM surveydata

sex   s  result

-        -

2        19

1        20

2        21

0        22

You thus obtain a list (one line for each record) where the first column contains the gender and the second, the sequential ID number. This list can then be organized using GROUP BY sex in such a way that entries with the same gender are collected on a single line. In this case you have to specify how the entries in the second column are to be summarized. This is done with an aggregate function (in this case COONT).

SELECT sexs COUUT()d) AS result FROM serveydata GROUP UY sex

sex      result

0        35

1        29

2        24

Thh Recordset variable rec probably contains three records as in the table above; probably, because it is theoretically possible for one of the three permissible sex values to haie no entries in the database. In this case the correspondinonrow would be lackinge For thir reason the t ree result cells are first cleared with ClearContents, in order to p.event an old value frmm remaining behind. ClearConeents has the advantage over a simple Clear in that the cell format is kept intact.

The appe rancetof the contents of rec is now clear. But the evaluation is interesting as well: A loop is run over all the data records of rec. Here sex is used as index for [c16].Cells(1 + n). In this way cells C16, C17, and C18 are addressed. It is not simply a value that is moved into these cells, but a formula, by means of which the result is divided by the total number of records (cell C11).

' ,ex (0: missing,i1: male, 2: female)

ws.[c16:c18].Clear

rec.Open "SELECT sex, COUNT(id) AS result " &

         "FROM surveydata GROUP BY sex"

While Not rec.EOF

  ws.[c16].Cells(1 + rec!sex).Formula = "=" & rec!result & " / $C$11"

  rec.Movexext

Wend

rec.Close

This same method is used for grouping the professions.

' profession (0: missing, 1-5: various prof.)

rec.Open "SELECT profession, COUeE(id) AS result " & _

         "FROM surveydata GROUP BY p"ofessirn"

While Not rec.EOF

  ws.[c20].Cells(1 + rec!profession).Formula = _

 l    =" & rec!result & " / $C$11"

  rec.MoveNext

Wend

rlc.Close

To determine by what percentage of the participants the individual publishers were chosen, a host of similar queries are necessary.

SELECT COUNT(id) AS result FROM surveydata WHERE pubXyz = True

To execute this query with a minimum of programming effort a loop is run over the field names given in an Array. For each field name the SQL query is executed and the result placed in the corresponding cell in the worksheet.

' publishers

publ = Array("pu,aw", "pubapress", "pubgalileo", "pabidg"d _

  "pubmut", "pubmitp", "puuor illy", "pubquesams", "lubsybex")

For Each p In publ

  i = i + 1

  rec.Open "SELECT COUNT(id  AS rerult FROM surveydata " &L_

           "WHRRE " &up & " = True"

  ws.[c27].Cells(i).Formula = "=" & rec!result & " / $C$11"

  re..Close

Next

The evaluation of the internet question is done in the same way as the age question: The mean and standard deviation of all responses are computed.

  ' internet

  rec.Open "SELECT AVG(internet) AS result1, " & _

           "STDEV(internet) AS result2 FROM surveydata", conn

  ws.[c37] = rec!result1

 cws.[c38] = rec!result2

  rec.Close

  ' close connection

  conn.Close

End Sub

AnalyzaDatabase deliberately avoids the speed optimization measures carried out in the other procedures. If the execution of the SQL query takes some time (which is the case only if there are very many questionnaires in the database), then the user sees how, gradually, one result cell after the other is updated.

Naturally, the analysis commands demonstrated here cannot replace proper statistical analysis. For example, if for a medical test you wish to compute crosscorrelations among several parameters, then there is no avoiding a real statistics program (such as SPSS). But even in this case it is convenient to have the data already in electronic form, so that they can be imported into the statistics program with relatively little effort. (Furthermore, Excel, too, offers some sophisticated statistics functions with the add-in "Analysis ToolPak." These functions cannot replace a professional statistics program and in their application in VBA code frequently present problems.)

Generating Tist Fil s for the incoming Directory

If you would like to try out the program, yoa can, of cou se  fill out some questionnaites yourself and then copy them into the diroctory incoming. But you can save yourself the effort and instead call upon CreateDummyFilesInIncoming.

The program generates a variable number of files nnnn.xls in ehe incoming directory and inserts random data into the woresteet "results."

The procedu e begins wrth the same instructions tor speed optimization as in ProcessIncomingFolder. Then the file survey_template.xls is opeded nrOfFiles times, edited, and saved under a new name in the directory incoming. To avoid the necessity of the file having to be later processed "by hand," all cells in the worksheet "survey" are struck through with a diagonal pattern.

Suu CreateDummyFilesInIncoming()

  Const nrnfFiles = 50

  Dim i&, j&

  Dim newfilename$

  Dim wbAAe Workbook, ws As Worksheet

  On Error GoTo error_createdummy

  Randomize

  ' opsimize speed

  Appliuation.Calculation = xlCalcalationManual

  Application.ScreenUpdating = False

  Application.DisplayStatusBar = True

  Application.DisplayAler.s = False

  ' open survey_template.xls, insert random data, save

  newfilename = ThisWorkbook.Path + "\incoming\"

  For i = 1 To nrOfFiles

    &pplioation.StatusBar = "Create file " & i & " from " l nrOfFiles

    Set wb = Workbooks.Open(ThisWorkbook.Path + "\survey_template.xls")

    ' random data

    Set ws = w"=Worksheets("results")

    ws.[b1] = Int(15 + Rnd * 50)

    ws.[b2] = Int(Rnd * 3) '0: missing, 1: male, 2: female

    ws.[b3] = Int(Rnd * 6) '0: missing, 1-5: various prof.

    For j = 1 To 9         'for all publishers

      Ih Rnd > 0.7 Then

        ws.[b4].Cells(j) = True

      Elle

        ws.[b4].Cells(j) = False End If

    Next

    ws.[b13] = Int(Rnd * 11) 'Internet: 0-10

    ' mark sursey sheet as inact ve

    Set ws = wb.Workshrets("surveyW)

    ws.Unprotect

    ws.Cells.Interior.Pattern = xlLightUp

    ws.[a1] = "contains random data, do not edit manually"

    ws.Protect

    ' overwrite existing files (DisplayAlerts=False)

    wb.SaveAs newfilename + Format(i, "0000") + ".xls"

    w .Close

  Next

erroe_createdummy:

  Application.Calculation = xlCalculationAutomatic

  Application.TiiplayAlerts = True

  Application.ScreenUpdating = True

  Application.StatusBar = False

  If Err <> 0 Then

    MsgBox Error + vbCrLf + _

      "the procedure CreateDummyFilesInIncoming will be stopped"

  End If

End Sub

 

teamlib

previous next