Newbies CBarnes Posted June 5, 2023 Newbies Share Posted June 5, 2023 Good morning, I have a flat file (Excel) that has a weekStop Export Data 2023-01-01 to 01-07.csv of data with the following columns VehicleID, VehicleName, DriverID, DriverFirstName, DriverLastName, RouteID, RouteName, PatternID, PatternName, StopID, StopName, StopNumber, Timepoint, TripID, TripName, RunID, RunName, BlockID, Arrive, ArriveVariance, ScheduledArrive, Depart, DepartVariance, ScheduledDepart, Ons, Offs, ArrivalPassengers,VehicleCapacity, and DepartingPassengers. I created the following tables: Vehicle with VehicleID, VehicleName, VehcileCapacity Driver with DriverID, DriverFirstName, DriverLastName Route with PatternID, PatternName, RouteID, RouteName Stops with StopID, StopName, StopNumber Schedule with Timepoint, TripID, TripName, RunID, RunName, BlockID, ArriveDate, ArriveTime, ArriveVariance, DepartDate, DepartTime, DepartVariance, ScheduledArriveDate, ScheduledArriveTime, ScheduledDepartDate, ScheduledDepartTime Each table has the automated Primary Key. I am not sure how to make the relationships (ERD). Should I use the Primary Keys, or the imported IDs (VerhicleID, DriverID, etc)? I have attached a smaller file of one day for reference, as a week file is 30mb. I'd appreciate any help on this. Thanks, Craig Stop Export Data 2023-05-02.xlsx Link to comment Share on other sites More sharing options...
comment Posted June 5, 2023 Share Posted June 5, 2023 If the imported IDs are both unique and permanent, you'd be better off using them instead of your own. This will allow you to import the flat data directly into the target tables (separately for each table, of course) and have it link up automatically. However, this is assuming you won't be creating any records of your own in these tables. If you do, you will have to find a method to generate an ID that is guaranteed to be different from any ID that your data provider may come up with. Or move to using your own IDs - which will make the import action much more complicated. Link to comment Share on other sites More sharing options...
Newbies CBarnes Posted June 5, 2023 Author Newbies Share Posted June 5, 2023 All of the data will always be imported. Should I create tables Trips (TripID, TripName), Runs (RunID, RunName), Blocks (BlockID) since they have their own ID, or leave them in Schedule? Schedule with Timepoint, TripID, TripName, RunID, RunName, BlockID, ArriveDate, ArriveTime, ArriveVariance, DepartDate, DepartTime, DepartVariance, ScheduledArriveDate, ScheduledArriveTime, ScheduledDepartDate, ScheduledDepartTime What key field should be used in Schedule? Next question is how would I connect the tables? Link to comment Share on other sites More sharing options...
comment Posted June 6, 2023 Share Posted June 6, 2023 9 hours ago, CBarnes said: Should I create tables Trips (TripID, TripName), Runs (RunID, RunName), Blocks (BlockID) since they have their own ID, or leave them in Schedule? I don't know. It depends on what you intend to do with the data. The reasons for normalizing a database are functional, not ideological. If all your data comes from an outside source and you do not intend to modify it, then the most compelling reason to normalize - i.e to avoid update anomaly - is absent. 9 hours ago, CBarnes said: What key field should be used in Schedule? IIUC, Schedule is the most atomic table here - IOW, it is a child table to several other tables, but parent to none. As such, it doesn't really need to have a primary key. Again, it depends on what your purpose is. Most often, we tend to include an auto-entered serial number or UUID field "just in case it might be needed at some point". Link to comment Share on other sites More sharing options...
Newbies CBarnes Posted June 6, 2023 Author Newbies Share Posted June 6, 2023 The original source does not have reporting and sorting capabilities. It will be used for reporting and looking up data. For example all 56005 trips in one day, or all weekdays. Many reports plan to be built from the data. Link to comment Share on other sites More sharing options...
comment Posted June 6, 2023 Share Posted June 6, 2023 (edited) 6 hours ago, CBarnes said: It will be used for reporting and looking up data. That doesn't mean much. Some reports require (or at least can benefit from) a normalized structure, some (perhaps even most) can be produced just as easily from a flat table. It all depends on the specific details of the report. I am not trying to dissuade you from normalizing your data. I am just saying the extra effort might not be worth it. Edited June 6, 2023 by comment Link to comment Share on other sites More sharing options...
Recommended Posts
Create an account or sign in to comment
You need to be a member in order to leave a comment
Create an account
Sign up for a new account in our community. It's easy!
Register a new accountSign in
Already have an account? Sign in here.
Sign In Now