MS Excel: Cleverer Table Importer

These are some functions that help you write a script to import Excel data into a SQL database. What makes this different from the Access import feature is that the data can be poorly formatted. This specific code is for the Crystal Reports export feature. Crystal exports data by converted the final output to an Excel sheet, but the sheet includes the headers and titles, as well as blank columns. In short, it’s not ready to import.

Additionally, the CSV export feature of Crystal spits out incomplete data, so the Excel export is the best export.

So, what we need is an importer that can read data with empty columns, with a header line way down the page a few lines.

This partially completed importer works by finding, then analyzing the header line for column names, and noting which column name goes with which column number. With the offsets of each column, then, loop over the table, mapping each column back to column names, and using that to create an SQL string to insert the data. We also pass in some hints about which fields to quote, and which to convert from dateserials to textual dates.

This code doesn’t yet have the necessary code to import the data into the table. The final version of the code will run within Access, and control an instance of Excel.


Public Sub test()
    Dim offsets As Dictionary
    Dim quotes As New Dictionary
    Dim row As Dictionary
    Dim dest As New Dictionary
    
    quotes.Add "code", "quote"
    quotes.Add "PaidThrough", "date"
    quotes.Add "Mems", "number"
    quotes.Add "UpdateTime", "quote"
    n = Format(Now(), "yyyy/mm/dd")
    
    import_goto_start ("Customer #")
    Set offsets = import_get_heading_offsets
    ' move cursor down one cell
    While (Application.Selection <> "")
        Application.ActiveCell.Offset(1, 0).Select
        Set row = import_get_row(offsets)
        dest.RemoveAll
        dest.Add "code", row("Customer #")
        dest.Add "PaidThrough", row("through")
        dest.Add "Mems", row("Members")
        dest.Add "UpdateTime", n
        Sql = import_build_sql("foo", dest, quotes)
        Debug.Print Sql
    Wend
End Sub

Public Sub import_goto_start(search As String)
    ' moves cursor to the first likely line of data, which is the first
    ' cell of the header row.  Call this before anything else.
    r = 1
    While (r < 20)
        c = 1
        While (c < 5)
            With Workbooks(1).Worksheets(1)
                If (.Cells(r, c) = search) Then
                    .Cells(r, c).Select
                    Exit Sub
                End If
            End With
            c = c + 1
        Wend
        r = r + 1
    Wend
End Sub

Function import_get_heading_offsets() As Dictionary
    ' returns a dictionary mapping field names to column numbers
    Dim res As New Dictionary
    Dim r As Integer
    Dim c As Integer
    
    With Workbooks(1).Worksheets(1)
        c = Application.ActiveCell.Column
        r = Application.ActiveCell.row
        For col = c To 100
            Heading = .Cells(r, col).Value2
            If Heading <> "" Then
                res.Add col, Heading
            End If
        Next
    End With
    ' return that dictionary
    Set import_get_heading_offsets = res
End Function

Function import_get_row(offsets As Dictionary) As Dictionary
    ' returns a row of data as an associative array
    Dim res As New Dictionary
    With Workbooks(1).Worksheets(1)
        r = Application.ActiveCell.row
        ' what is the way to scan the row based on the collection's contents???
        For col = 1 To 10
            If offsets.Exists(col) Then
                res.Add offsets.Item(col), .Cells(r, col).Value2
                'Debug.Print "Adding " & .Cells(r, col).Value2 & " : " & offsets.Item(col)
            Else
                'Debug.Print "Column " & col & " ignored. " & offsets.Item(col) & " : " & .Cells(r, col).Value2
            End If
        Next
    End With
    Set import_get_row = res
End Function

Function import_build_sql(table As String, data As Dictionary, quotes As Dictionary) As String
    ' takes an associative array as input and generates an "insert"
    ' for the table.  the field names must match.
      s = ""
    For Each d In data
        If s <> "" Then s = s & ", "
        If (quotes(d) = "quote") Then
            s = s & " " & d & "='" & data(d) & "'"
        ElseIf (quotes(d) = "date") Then
            s = s & " " & d & "='" & Format(data(d), "yyyy/mm/dd") & "'"
        Else
            s = s & " " & d & "=" & data(d)
        End If
    Next
    s = "INSERT INTO " & table & s
    import_build_sql = s
End Function

' PHP pseudocode
' offsets = import_get_heading_offsets()
' while( row = import_get_row(offsets) ) :
'    new['field1'] = row['fieldx']
'    ...
'    sql = import_build_sql('table', new)
'    cn.execute sql
' endwhile

The code’s a little bit dirty. VBA Dictionaries were hard to learn, because MS docs tend to have simple example code. There are a few places I wished to make more efficient.