Chapter 7: The Role of IT in Modern Manufacturing

Introduction: The Invisible Orchestrator

The plant runs 24/7. Machines hum, operators move materials, quality inspectors check parts. On the surface, it's all mechanical—metal, motors, and muscle.

But beneath this physical world runs an invisible nervous system: IT. Every work order dispatched, every quality measurement recorded, every supply chain alert triggered, every predictive maintenance notification—all orchestrated by the IT infrastructure most people never see.

IT's role in manufacturing has transformed:

1990s IT: "Keep email running. Back up the ERP database. Don't let the plant see you."

2024 IT: "Integrate 500 machines across 10 plants. Enable real-time analytics. Deploy AI models to the edge. Oh, and do it without causing a single minute of downtime."

The stakes have changed. IT is no longer a support function—it's the enabler of every strategic initiative.

  • Want predictive maintenance? IT must collect sensor data, contextualize it, train ML models, deploy them.
  • Need supply chain visibility? IT integrates ERP, TMS, supplier portals, customs systems.
  • Pursuing sustainability? IT tracks energy consumption, allocates carbon by SKU, generates ESG reports.

This chapter defines the modern role of IT in manufacturing, from systems architecture to integration patterns to governance models. If you're building or managing IT for manufacturers, this is your playbook.


The IT/OT Convergence Imperative

The Traditional Divide

For decades, IT (Information Technology) and OT (Operational Technology) operated in parallel universes:

Table 7.1: The IT/OT Cultural Divide

Aspect | IT (Information Technology) | OT (Operational Technology)
Primary Goal | Enable business processes (order-to-cash, etc.) | Control physical processes (temperature, pressure, speed)
Users | Office workers, executives | Operators, engineers, technicians
Systems | ERP, CRM, HR, finance | SCADA, PLCs, DCS, HMI
Uptime Expectation | 99% (scheduled maintenance windows OK) | 99.99% (24/7, maintenance during planned shutdowns only)
Security Priority | Confidentiality > Integrity > Availability | Availability > Integrity > Confidentiality
Change Cycle | Weekly/monthly releases | Quarterly/yearly (requires testing, validation)
Vendor Ecosystem | Microsoft, SAP, Oracle, Salesforce | Siemens, Rockwell, Schneider Electric, ABB
Network | Corporate LAN/WAN, internet-connected | Isolated plant networks, air-gapped
Skills | Database admins, app developers, network engineers | Instrumentation engineers, control engineers, electricians
Reporting Structure | CIO (Chief Information Officer) | VP Operations or VP Engineering

Historical Reason for Separation: OT systems were designed when cybersecurity wasn't a concern (1980s-1990s). Connecting them to the internet risked catastrophic failures. So they were air-gapped.

Modern Reality: Industry 4.0 demands real-time data flow from OT → IT. Air gaps are crumbling. But the cultural and technical divide persists.


Why Convergence is Non-Negotiable

Driver #1: Data is the New Oil

OT systems generate the large majority of manufacturing data, by some estimates around 90% (sensor readings, machine states, quality measurements). But it's locked in proprietary formats, isolated historians, and unreachable PLCs.

IT's Role: Extract, transform, load (ETL) OT data into analytics platforms where business decisions are made.

Example:

  • OT Data: Machine X temperature = 185°C, pressure = 50 PSI (locked in Wonderware historian)
  • IT Transformation: Contextualize → "Line 3, Widget A production, Machine X (asset ID: M-12345), Temp 185°C (spec: 180-190), Pressure 50 PSI (spec: 45-55), Status: In Spec"
  • Business Value: Correlate temperature with quality defects; if temp >188°C, scrap rate spikes → auto-alert operator
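The contextualization step in the example above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the `ASSETS` registry, the `SpecLimit` type, and the `contextualize` function are hypothetical names, and in practice the asset and spec data would come from MES/ERP master data rather than a hard-coded dictionary.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpecLimit:
    low: float
    high: float
    unit: str


# Hypothetical asset registry; values mirror the Machine X example above.
ASSETS = {
    "M-12345": {
        "line": "Line 3",
        "product": "Widget A",
        "specs": {
            "temperature": SpecLimit(180.0, 190.0, "°C"),
            "pressure": SpecLimit(45.0, 55.0, "PSI"),
        },
    }
}

SCRAP_RISK_TEMP_C = 188.0  # alert threshold taken from the example above


def contextualize(asset_id: str, readings: dict[str, float]) -> dict:
    """Attach asset context and spec status to raw historian readings."""
    asset = ASSETS[asset_id]
    measurements = {}
    for name, value in readings.items():
        spec = asset["specs"][name]
        measurements[name] = {
            "value": value,
            "unit": spec.unit,
            "spec": (spec.low, spec.high),
            "in_spec": spec.low <= value <= spec.high,
        }
    in_spec = all(m["in_spec"] for m in measurements.values())
    return {
        "asset_id": asset_id,
        "line": asset["line"],
        "product": asset["product"],
        "measurements": measurements,
        "status": "In Spec" if in_spec else "Out of Spec",
        # In spec but above 188°C still warrants an operator alert.
        "scrap_risk_alert": readings.get("temperature", 0.0) > SCRAP_RISK_TEMP_C,
    }


record = contextualize("M-12345", {"temperature": 185.0, "pressure": 50.0})
print(record["status"], record["scrap_risk_alert"])
```

Note that a reading of 189°C would still be "In Spec" yet raise the scrap-risk alert, which is the kind of insight the raw historian value alone cannot provide.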

Driver #2: Remote Operations (Post-COVID)

Pre-COVID, engineers traveled to plants to troubleshoot. During COVID, travel was restricted, and OEMs needed remote access to support their equipment. That need has persisted since.

Challenge: OT networks were never designed for remote access.

IT's Role: Build secure remote access (VPN, MFA, session monitoring) without compromising OT network integrity.


Driver #3: AI/ML Requires Data at Scale

Predictive maintenance, quality prediction, demand forecasting—all require massive datasets spanning years of OT data + IT data (orders, BOMs, supplier quality).

IT's Role: Build data lakes that unite IT and OT data with governance, lineage, and quality controls.


The Convergence Framework

Not Integration. Convergence.

Integration: Build a one-way data pipe (OT → IT). IT reads; OT ignores IT.

Convergence: Bi-directional collaboration. IT provides analytics; OT acts on insights. Closed-loop systems.

Table 7.2: Integration vs. Convergence

Aspect | Integration | Convergence
Data Flow | One-way (OT → IT) | Bi-directional (OT ↔ IT)
Governance | Separate teams | Joint governance (IT + OT RACI)
Security | IT secures IT; OT secures OT | Unified security posture (NIST CSF across both)
Infrastructure | Separate networks, air-gapped | Secure bridge (DMZ, firewalls, monitored)
Culture | "Us vs. Them" | "Shared mission"
Example | Read SCADA data into ERP for reporting | MES adjusts PLC setpoint based on AI quality model

Convergence Architecture:

Key Design Principles:

  1. Unidirectional Gateways (Diodes) for Critical Systems: OT → IT data flows freely over a hardware-enforced one-way path; any IT → OT commands travel over a separate, heavily restricted channel (authorized commands only)
  2. DMZ Between IT and OT: Converge in a neutral zone with strict firewall rules
  3. Least Privilege: IT users can't directly access PLCs; OT users can't access ERP financials
  4. Monitoring: All traffic between IT/OT is logged, analyzed for anomalies
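Principles 2 to 4 amount to a default-deny policy with full logging. The sketch below illustrates the idea only; the zone names, the `ALLOWED_FLOWS` allowlist, and the specific rules are assumptions chosen for illustration (the ports are the standard ones for OPC UA, MQTT over TLS, HTTPS, and Modbus TCP), not a recommended rule set. Real enforcement lives in firewalls, not application code.

```python
from typing import NamedTuple


class Flow(NamedTuple):
    src_zone: str  # "OT", "DMZ", or "IT"
    dst_zone: str
    port: int


# Hypothetical allowlist: anything not listed here is denied.
ALLOWED_FLOWS = {
    Flow("OT", "DMZ", 4840),  # OPC UA from plant floor to a DMZ collector
    Flow("DMZ", "IT", 8883),  # MQTT over TLS from a DMZ broker to IT analytics
    Flow("IT", "DMZ", 443),   # IT users reach only the DMZ jump host or API
}


def is_allowed(flow: Flow) -> bool:
    """Default-deny: a flow passes only if it is explicitly allowlisted."""
    return flow in ALLOWED_FLOWS


def audit(flow: Flow) -> str:
    """Produce a log line for every decision so all IT/OT traffic is recorded."""
    verdict = "ALLOW" if is_allowed(flow) else "DENY"
    return f"{verdict} {flow.src_zone}->{flow.dst_zone}:{flow.port}"


print(audit(Flow("OT", "DMZ", 4840)))  # plant data leaving via the DMZ
print(audit(Flow("IT", "OT", 502)))    # direct Modbus access from IT to a PLC
```

No rule connects IT directly to OT, so a direct request from the corporate network to a PLC is rejected and logged rather than silently dropped.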

The Manufacturing IT Systems Landscape

Core Systems and Their Roles

Modern manufacturing IT is a constellation of systems, each serving a specific function.

Table 7.3: Manufacturing IT Systems Portfolio

System | Acronym | Primary Function | ISA-95 Level | Key Vendors | Typical Users
Enterprise Resource Planning | ERP | Finance, procurement, order management, inventory | Level 4 | SAP, Oracle, Microsoft D365, Infor | Finance, sales, procurement, planning
Product Lifecycle Management | PLM | Product design, BOMs, engineering change | Level 4 | Siemens Teamcenter, PTC Windchill, Dassault 3DEXPERIENCE | Engineers, product managers
Manufacturing Execution System | MES | Work order dispatch, tracking, quality, genealogy | Level 3 | Siemens Opcenter, Rockwell FactoryTalk, SAP MES, Dassault DELMIA | Production supervisors, operators
Quality Management System | QMS | NC/CAPA, audits, supplier quality, inspections | Level 3/4 | ETQ, MasterControl, Sparta Systems | Quality engineers, QC inspectors
Supervisory Control and Data Acquisition | SCADA | Real-time process monitoring, equipment control | Level 2 | Wonderware, Ignition, GE iFIX, Siemens WinCC | Operators, process engineers
Programmable Logic Controller / Distributed Control System | PLC/DCS | Device-level control (motors, valves, sensors) | Level 1 | Allen-Bradley, Siemens, Schneider Electric, ABB | Controls engineers, electricians
Computerized Maintenance Management System | CMMS | Work orders, PM schedules, spare parts | Level 3 | IBM Maximo, eMaint, Fiix | Maintenance managers, technicians
Warehouse Management System | WMS | Inventory location, picking, shipping | Level 3 | Manhattan, Blue Yonder, SAP EWM | Warehouse staff, logistics
Transportation Management System | TMS | Freight planning, carrier selection, tracking | Level 4 | Oracle TMS, Blue Yonder, Descartes | Logistics, supply chain
Laboratory Information Management System | LIMS | Lab sample tracking, test results, COAs | Level 3 | LabWare, Thermo SampleManager | Lab technicians, quality
Historian/Data Lake | - | Time-series data storage, analytics foundation | Level 2/3 | OSIsoft PI, Honeywell PHD, AWS Timestream, Snowflake | Data engineers, analysts
Advanced Planning & Scheduling | APS | Finite capacity scheduling, optimization | Level 3/4 | Blue Yonder, Kinaxis, Siemens Opcenter APS | Production planners

System Interdependencies

No system operates in isolation. A typical manufacturing transaction touches 5+ systems.

Example Flow: Customer Order → Shipment

1. [CRM] Sales rep enters order
      ↓
2. [ERP] Order Management validates (credit check, ATP - Available to Promise)
      ↓
3. [ERP] MRP generates work order for items not in stock
      ↓
4. [PLM] Work order references BOM (which revision? which components?)
      ↓
5. [MES] Work order dispatched to Line 3
      ↓
6. [SCADA/PLC] Machine executes production (MES monitors)
      ↓
7. [MES] Operator confirms completion (good count, scrap count)
      ↓
8. [QMS] Quality inspection triggered (pass/fail)
      ↓
9. [MES → ERP] Inventory confirmation (backflush materials, receive finished goods)
      ↓
10. [WMS] Allocate finished goods to customer order, generate pick list
      ↓
11. [TMS] Schedule shipment, select carrier
      ↓
12. [ERP] Invoice customer

TOTAL SYSTEMS: 8 (CRM, ERP, PLM, MES, SCADA/PLC, QMS, WMS, TMS)
INTEGRATIONS: 10+ data exchanges

If any integration fails: Order stalls. Customer calls. Expedite costs spiral.


Integration Patterns

Table 7.4: Common Integration Approaches

Pattern | Use Case | Pros | Cons | Typical Cost
Point-to-Point (P2P) | <5 systems, simple data exchange | Fast to implement | Up to N×(N-1) one-way interfaces; a nightmare to maintain | $20K-50K/interface
Enterprise Service Bus (ESB) | 5-20 systems, complex transformations | Centralized logic, reusable | Single point of failure if not HA; specialized skills | $200K-1M
API Gateway + Microservices | Modern, cloud-native, rapid iteration | Decoupled, scalable | Requires API-first systems (legacy may not have APIs) | $100K-500K
iPaaS (Integration Platform as a Service) | SaaS-heavy, rapid onboarding | Low upfront cost, pre-built connectors | Vendor lock-in, recurring fees | $50K-200K/year
Data Lake/ETL | Analytics use case, not real-time | Massive scale, supports AI/ML | Not suitable for transactional integration | $200K-2M
Event-Driven (Kafka, Pulsar) | High-throughput, real-time events | Scalable, decoupled producers/consumers | Complexity, eventual consistency | $300K-1M
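The point-to-point maintenance problem is easiest to see with numbers. The short sketch below compares the worst-case count of one-way point-to-point interfaces, N×(N-1), with a hub-style topology in which each system connects once to a shared ESB, gateway, or event bus. The function names are illustrative, and the worst case assumes every system exchanges data with every other, which real landscapes rarely reach.

```python
def point_to_point_interfaces(n: int) -> int:
    """Worst case: every system sends directly to every other system."""
    return n * (n - 1)


def hub_interfaces(n: int) -> int:
    """Each system connects once to a shared hub (ESB, gateway, or event bus)."""
    return n


for n in (3, 5, 10, 20):
    print(f"{n:>2} systems: P2P={point_to_point_interfaces(n):>3}  hub={hub_interfaces(n):>2}")
```

At 5 systems the gap is 20 versus 5; at 20 systems it is 380 versus 20, which is why the table caps P2P at small landscapes.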

Recommendation: Hybrid Approach

  • Real-Time Transactions: API Gateway (ERP ↔ MES work orders)
  • Real-Time Events: Event Bus (SCADA machine states, quality alerts)
  • Batch Analytics: Data Lake (historian → lakehouse for ML)
  • Legacy Systems: ESB or iPaaS (accommodate systems without APIs)

Data: The Lifeblood of Manufacturing IT

The Data Challenge

Manufacturing data volumes add up quickly, reaching petabytes annually at large multi-plant manufacturers:

  • Time-Series: Sensor readings every 100ms = 864,000 data points/day/sensor × 10,000 sensors = 8.6B data points/day
  • Transactional: Work orders, quality inspections, shipments = millions of records/year
  • Master Data: BOMs, routings, parts = millions of records
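The time-series arithmetic above can be checked, and extended to a rough storage estimate, with a few lines of Python. The 16 bytes per data point is an assumption (an 8-byte timestamp plus an 8-byte value, uncompressed and without tags or indexes); actual storage depends heavily on the historian and its compression.

```python
SECONDS_PER_DAY = 24 * 60 * 60
BYTES_PER_POINT = 16  # assumption: 8-byte timestamp + 8-byte float, uncompressed


def points_per_day(sample_interval_ms: int, sensors: int) -> int:
    """Data points generated per day at a fixed sampling interval."""
    per_sensor = SECONDS_PER_DAY * 1000 // sample_interval_ms
    return per_sensor * sensors


daily = points_per_day(sample_interval_ms=100, sensors=10_000)
daily_gb = daily * BYTES_PER_POINT / 1e9
yearly_tb = daily_gb * 365 / 1000

print(f"{daily:,} points/day")
print(f"{daily_gb:.2f} GB/day, {yearly_tb:.2f} TB/year (raw, under the stated assumption)")
```

Under these assumptions a single 10,000-sensor site produces tens of terabytes of raw time-series data per year, so petabyte-scale totals come from many sites, faster sampling, or heavier payloads.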

But data is messy:

Data Quality Issue | Example | Impact | Solution
Inconsistent Definitions | "Downtime" means different things at 3 plants | Can't benchmark OEE | Standardize taxonomy (ISA-95)
Siloed Data | SCADA data in Historian A, ERP data in DB B, QMS in App C | No correlation (can't link temp to quality) | Data lake with unified model
Missing Context | Sensor reading "185" (units? asset? process?) | Unusable for analytics | Contextualize (asset ID, UOM, spec limits)
Stale Data | Batch uploads every 24 hours | Real-time decisions impossible | Streaming integration
Duplicate/Conflicting | Part X exists in PLM, ERP, MES with different descriptions | Confusion, errors | Master data management

The Data Platform Architecture

Goal: Single source of truth for all manufacturing data, accessible to authorized users/systems.

Table 7.5: Data Platform Components

Component | Purpose | Technology Examples
Data Ingestion | Collect data from sources | OPC UA servers, MQTT brokers, REST APIs, Kafka, Fivetran
Data Storage | Store raw and curated data | Time-Series: InfluxDB, TimescaleDB, AWS Timestream; Structured: Snowflake, Databricks, Azure Synapse; Unstructured: S3, Azure Blob
Data Cataloging | Document what data exists, where, quality | Collibra, Alation, AWS Glue Data Catalog
Data Governance | Ownership, access control, retention policies | Collibra, Informatica, custom RACI
Data Quality | Profiling, cleansing, validation | Talend, Informatica, Great Expectations
Data Transformation | ETL/ELT, contextualization | dbt, Apache Spark, AWS Glue, Databricks
Data Access | Query, API, streaming | SQL (Snowflake), GraphQL, REST APIs, Kafka Streams
Data Science | ML model training, experimentation | Databricks, Azure ML, AWS SageMaker, DataRobot

Layered Architecture:

┌─────────────────────────────────────────────────────────────┐
│  CONSUMPTION LAYER (Users & Apps)                            │
│  • Dashboards (Power BI, Tableau)                            │
│  • ML Models (Predictive Maintenance)                        │
│  • APIs (Real-time queries)                                  │
└─────────────────────────────────────────────────────────────┘
                          ↑
┌─────────────────────────────────────────────────────────────┐
│  CURATED LAYER (Gold)                                        │
│  • Business-ready datasets                                   │
│  • Aggregated KPIs (OEE by line, shift, SKU)                │
│  • Joined dimensions (asset + process + quality)             │
└─────────────────────────────────────────────────────────────┘
                          ↑
┌─────────────────────────────────────────────────────────────┐
│  ENRICHED LAYER (Silver)                                     │
│  • Cleaned, deduplicated                                     │
│  • Contextualized (asset ID → asset name, location)         │
│  • Validated (schema checks, business rules)                │
└─────────────────────────────────────────────────────────────┘
                          ↑
┌─────────────────────────────────────────────────────────────┐
│  RAW LAYER (Bronze)                                          │
│  • As-received from sources (no transformation)              │
│  • Immutable (append-only for audit)                         │
└─────────────────────────────────────────────────────────────┘
                          ↑
┌─────────────────────────────────────────────────────────────┐
│  SOURCES                                                     │
│  • PLCs, SCADA, MES, ERP, QMS, Historians                   │
└─────────────────────────────────────────────────────────────┘
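The flow through the layers can be illustrated with plain Python. This is a toy sketch under stated assumptions: the `ASSET_MASTER` lookup, the field names, and the choice of "mean temperature per line" as the gold KPI are all hypothetical, and a real platform would run this logic in Spark, dbt, or similar rather than in-memory lists.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical asset master used for contextualization in the silver layer.
ASSET_MASTER = {"M-12345": {"name": "Machine X", "line": "Line 3"}}


def to_silver(bronze_rows):
    """Clean raw rows: drop duplicates and invalid records, add asset context."""
    seen, silver = set(), []
    for row in bronze_rows:
        key = (row["asset_id"], row["ts"])
        if key in seen:
            continue  # deduplicate on asset + timestamp
        seen.add(key)
        asset = ASSET_MASTER.get(row["asset_id"])
        if asset is None or not isinstance(row["temp_c"], (int, float)):
            continue  # fails validation (unknown asset or non-numeric value)
        silver.append({**row, "asset_name": asset["name"], "line": asset["line"]})
    return silver


def to_gold(silver_rows):
    """Aggregate enriched rows into a business-ready KPI per line."""
    by_line = defaultdict(list)
    for row in silver_rows:
        by_line[row["line"]].append(row["temp_c"])
    return {line: round(mean(values), 2) for line, values in by_line.items()}


# Bronze is kept as received and never modified in place.
bronze = [
    {"asset_id": "M-12345", "ts": "2024-01-01T00:00:00Z", "temp_c": 185.0},
    {"asset_id": "M-12345", "ts": "2024-01-01T00:00:00Z", "temp_c": 185.0},  # duplicate
    {"asset_id": "M-12345", "ts": "2024-01-01T00:01:00Z", "temp_c": 187.0},
    {"asset_id": "UNKNOWN", "ts": "2024-01-01T00:01:00Z", "temp_c": 20.0},   # no master data
]
silver = to_silver(bronze)
gold = to_gold(silver)
print(len(bronze), len(silver), gold)
```

Because each layer is derived from the one below it, a change to a business rule can be replayed from the untouched bronze data without going back to the source systems.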

Data Governance Model:

Data Domain | Data Owner | Data Steward | Consumers | Retention
Production Data | VP Operations | Plant Manager | Operations, IT, Finance | 7 years
Quality Data | VP Quality | Quality Manager | Quality, Customers (on request) | 10 years (regulated: 30)
Equipment Data | VP Maintenance | Maintenance Manager | Maintenance, IT, Predictive Analytics | 5 years
Product Data (BOMs) | VP Engineering | PLM Administrator | Engineering, Production, Supply Chain | Lifecycle + 10 years
Financial Data | CFO | Controller | Finance, Executives | 7 years (legal requirement)

Edge vs. Cloud: The Architecture Decision

The Trade-Offs

Table 7.6: Edge vs. Cloud Comparison

Dimension | Edge (On-Premise / Plant) | Cloud (AWS, Azure, GCP)
Latency | <10ms (local processing) | 50-200ms (round-trip to cloud)
Bandwidth | LAN (Gbps) | WAN (10-100 Mbps typical)
Cost Model | High capex, low opex | Low capex, high opex (pay-per-use)
Scalability | Limited by hardware | Effectively unlimited (elastic)
Data Sovereignty | Full control (remains on-premise) | Depends (region selection, but shared infra)
Reliability | Depends on local UPS, redundancy | 99.99% SLA (multi-AZ, multi-region)
Maintenance | Local IT team (upgrades, patching) | Managed by cloud provider
Use Cases | Real-time control, closed-loop systems | Analytics, AI training, long-term storage
Compliance | ITAR (commonly kept on-premise, U.S. only) | Flexible (choose region for GDPR, etc.)

Hybrid Architecture (Best of Both)

Don't choose Edge OR Cloud. Use BOTH.

Edge: Real-time processing, low-latency decisions

Cloud: Scalable analytics, AI model training, long-term storage, cross-plant insights

Example: Predictive Maintenance

EDGE (Plant Floor):
1. Collect vibration data from motor (100 samples/sec)
2. Run lightweight anomaly detection model (flags unusual patterns)
3. If anomaly detected → alert operator immediately
4. Aggregate data (reduce 100 samples/sec to 1 summary/minute)
5. Send summary to cloud (1,440 data points/day vs. 8.6M if raw)

CLOUD (Data Center):
1. Receive aggregated data from 50 plants
2. Train advanced ML model on 5 years of data (billions of records)
3. Detect global patterns (motor model X fails after Y vibration pattern)
4. Deploy updated model to edge (monthly updates)
5. Long-term storage (7 years for compliance)

BENEFIT:
- Edge: Immediate alerts (no latency waiting for cloud)
- Cloud: Better models (trained on vast data)
- Bandwidth: 99.98% reduction (send summaries, not raw)
- Cost: Edge handles real-time; cloud handles heavy lifting
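The edge-side steps and the bandwidth figure can be sketched as follows. The threshold rule in `is_anomalous` is a deliberately simple stand-in for the "lightweight anomaly detection model" (an assumption, not the method the example prescribes), and the summary fields are illustrative.

```python
from statistics import mean, pstdev

SAMPLES_PER_SEC = 100
SECONDS_PER_DAY = 86_400


def summarize_minute(samples):
    """Reduce one minute of raw vibration samples to a single summary record."""
    return {"mean": mean(samples), "peak": max(samples), "std": pstdev(samples)}


def is_anomalous(summary, baseline_mean, baseline_std, z=3.0):
    """Stand-in rule: flag when the peak exceeds the baseline by z standard deviations."""
    return summary["peak"] > baseline_mean + z * baseline_std


# One minute of data (6,000 samples) containing a single spike.
minute = [1.0] * (SAMPLES_PER_SEC * 60 - 1) + [9.0]
summary = summarize_minute(minute)
print("alert operator:", is_anomalous(summary, baseline_mean=1.0, baseline_std=0.5))

# Bandwidth: raw samples versus one summary per minute.
raw_per_day = SAMPLES_PER_SEC * SECONDS_PER_DAY
summaries_per_day = 24 * 60
reduction_pct = (1 - summaries_per_day / raw_per_day) * 100
print(f"{raw_per_day:,} raw vs {summaries_per_day:,} summaries: {reduction_pct:.2f}% reduction")
```

Keeping the peak in the summary matters: a plain per-minute average would smooth the spike away before the cloud models ever saw it.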

Security: The Non-Negotiable Foundation

The OT Cybersecurity Threat Landscape

High-Profile Attacks:

  • Colonial Pipeline (2021): Ransomware shut down a major U.S. fuel pipeline for about 6 days; a ransom of roughly $4.4M was paid
  • JBS Foods (2021): Meat processing disrupted, 20% of U.S. beef supply offline
  • Norsk Hydro (2019): Aluminum producer forced to manual operations, $75M cost

Attack Vectors:

Vector | Description | Example | Mitigation
Phishing | Employee clicks malicious link; malware spreads to OT | Ukraine power grid attack (2015, spear phishing) | Security awareness training, email filtering
Removable Media | USB stick with malware plugged into HMI | Stuxnet (2010) | Disable USB ports, scan media before use
Vendor Remote Access | Vendor account compromised; attacker gains OT access | SolarWinds-style supply chain attack | MFA, session monitoring, vendor access broker
Insider Threat | Disgruntled employee or contractor sabotages | Maroochy Water (2000, sewage spill) | Least privilege, logging, separation of duties
Unpatched Vulnerabilities | Legacy PLC/SCADA with known CVEs | EternalBlue (WannaCry, 2017) | Virtual patching, network segmentation

Defense-in-Depth Architecture

Principle: Layers of security. If one fails, others remain.

Table 7.7: Security Layers

Layer | Controls | Technology/Process
Physical | Locked server rooms, badge access to control rooms | CCTV, access logs
Network | Segmentation (IT/OT DMZ, VLANs), firewalls | Cisco, Palo Alto, Check Point; rules per ISA/IEC 62443
Identity | MFA, SSO, least privilege, time-limited access | Active Directory, Okta, Azure AD; RBAC
Device | Hardened OS, application whitelisting, AV | Windows Defender, CrowdStrike, Carbon Black
Application | Secure coding, input validation, signed configs | OWASP Top 10 for web apps; signed PLC programs
Data | Encryption (at rest, in transit), DLP | TLS 1.3, AES-256; Data Loss Prevention tools
Monitoring | SIEM, anomaly detection, SOC | Splunk, Elastic, Nozomi (OT-specific)
Governance | Policies, training, incident response plan | NIST CSF, ISO 27001, tabletop exercises

NIST CSF for Manufacturing

NIST Cybersecurity Framework (CSF) is the de facto standard for U.S. manufacturers.

Five Functions (CSF 1.1; CSF 2.0, released in 2024, adds a sixth, Govern):

  1. Identify: Know your assets, risks, vulnerabilities

    • Asset inventory (every PLC, HMI, server)
    • Risk assessment (critical assets = high priority)
  2. Protect: Implement safeguards

    • Access control (MFA, least privilege)
    • Network segmentation (IT/OT DMZ)
    • Training (security awareness)
  3. Detect: Find incidents quickly

    • SIEM (correlate logs from IT/OT)
    • Anomaly detection (unusual PLC behavior)
  4. Respond: Contain and mitigate

    • Incident response plan (playbooks)
    • Tabletop exercises (test readiness)
  5. Recover: Restore operations

    • Backups (offline, tested)
    • Disaster recovery plan (RTO/RPO defined)

Implementation Tiers (commonly read as maturity levels):

Tier | Description | Characteristics
Tier 1: Partial | Ad-hoc, reactive | No formal policies; respond to incidents as they occur
Tier 2: Risk-Informed | Approved policies, but not org-wide | Some plants have controls; others lag
Tier 3: Repeatable | Formal policies, org-wide | Consistent controls across all sites; regular audits
Tier 4: Adaptive | Proactive, continuous improvement | Threat intelligence, predictive detection, evolving controls

Goal: Achieve Tier 3 minimum. Tier 4 for critical infrastructure or defense contractors (CMMC).


DevOps for Manufacturing IT

The Traditional IT Pain

Old Model: Waterfall development, 6-12 month release cycles.

Problem: By the time MES upgrade is deployed, business requirements have changed.

Example:

  • Month 0: Business says "We need real-time OEE dashboards."
  • Month 6: IT finishes development, starts testing.
  • Month 10: IT deploys to production.
  • Month 11: Business says "We changed our downtime taxonomy 3 months ago; this dashboard shows the wrong data."
  • Outcome: $500K project delivers zero value.

DevOps Principles for Manufacturing

DevOps: Development + Operations. Rapid, iterative delivery with automated testing and deployment.

Table 7.8: Waterfall vs. Agile/DevOps

Aspect | Waterfall | Agile + DevOps
Release Cycle | 6-12 months | 2-4 weeks
Requirements | Fixed upfront (Big Requirements Doc) | Evolving (user stories, sprint-by-sprint)
Testing | Manual, at end (weeks to test) | Automated, continuous (minutes to test)
Deployment | Manual, risky (all-hands on deck) | Automated, low-risk (push-button)
Rollback | Difficult (no automation) | Easy (revert to previous version)
User Feedback | After go-live (too late) | Every 2 weeks (sprint demo)
Risk | High (big-bang failures) | Low (incremental changes)

CI/CD Pipeline for MES/Analytics

Continuous Integration (CI): Developers commit code frequently; automated tests run on every commit.

Continuous Delivery/Deployment (CD): Code that passes tests is deployed to production automatically (continuous deployment) or promoted from staging to production after an approval gate (continuous delivery).

Example Pipeline:

1. DEVELOPER commits code (MES dashboard update)
      ↓
2. CI SYSTEM (Jenkins, GitLab CI) triggers
      ↓
3. BUILD: Compile code, package artifacts
      ↓
4. AUTOMATED TESTS:
   - Unit tests (individual functions)
   - Integration tests (MES → ERP API call)
   - Security scans (OWASP ZAP, SonarQube)
      ↓
5. DEPLOY TO STAGING: Auto-deploy to non-production environment
      ↓
6. SMOKE TESTS: Verify staging environment functional
      ↓
7. APPROVAL: Product owner reviews staging, approves for production
      ↓
8. DEPLOY TO PRODUCTION: Blue-green deployment (zero downtime)
      ↓
9. MONITORING: Dashboards confirm successful deployment; rollback if issues

Timeline: Commit → Production in <1 hour (vs. weeks with manual process).
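The gating behavior of such a pipeline, where a failed stage stops everything after it, can be sketched in Python. This is only an illustration of the control flow: real pipelines are defined in the CI system's own configuration (Jenkins, GitLab CI), and the stage names and lambda stand-ins below are hypothetical.

```python
from typing import Callable

Stage = tuple[str, Callable[[], bool]]


def run_pipeline(stages: list[Stage]) -> list[str]:
    """Run stages in order and stop at the first failure so later stages never run."""
    log = []
    for name, step in stages:
        ok = step()
        log.append(f"{name}: {'PASS' if ok else 'FAIL'}")
        if not ok:
            log.append("ABORTED: production untouched")
            break
    return log


# Hypothetical stages; each callable stands in for a real build or test job.
stages = [
    ("BUILD", lambda: True),
    ("UNIT TESTS", lambda: True),
    ("INTEGRATION TESTS", lambda: False),  # simulate a failing MES -> ERP API test
    ("DEPLOY TO STAGING", lambda: True),
    ("DEPLOY TO PRODUCTION", lambda: True),
]

for line in run_pipeline(stages):
    print(line)
```

The point of the design is that a broken integration test costs minutes of developer time instead of an outage on the plant floor.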


Infrastructure as Code (IaC)

Problem: Setting up a new plant's IT infrastructure (servers, networks, apps) takes 6 months of manual work.

Solution: Define infrastructure in code (Terraform, Ansible, CloudFormation). Deploy automatically.

Example:

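# Terraform code to deploy MES environment
# (AMI ID, identifier, and username are placeholders; substitute real values per plant)

resource "aws_instance" "mes_server" {
  ami           = "ami-mes-2024-v5"  # placeholder, not a real AMI ID
  instance_type = "m5.4xlarge"
  tags = {
    Plant    = "Mexico-Monterrey"
    Function = "MES"
    Backup   = "Daily"
  }
}

resource "aws_db_instance" "mes_database" {
  identifier        = "mes-database"  # placeholder name
  engine            = "postgres"
  instance_class    = "db.r5.xlarge"
  allocated_storage = 500   # GB (the argument is allocated_storage, not storage)
  multi_az          = true  # High availability

  username                    = "mes_admin"  # placeholder
  manage_master_user_password = true         # RDS stores the password in Secrets Manager (recent AWS provider versions)
}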

Benefit: New plant IT stack deployed in 1 day (not 6 months). Standardized (no drift between plants).


Observability: Know What's Happening

The Three Pillars

Observability = Logs + Metrics + Traces

Table 7.9: Observability Components

Pillar | Purpose | Example | Tool
Logs | Event records (what happened, when, where) | "MES-ERP integration failed: Timeout after 30 sec" | Splunk, Elastic, Datadog
Metrics | Numeric measurements over time | API latency, CPU usage, message queue depth | Prometheus, Grafana, Datadog
Traces | Request flow across services | Order #12345: CRM → ERP (200ms) → MES (500ms) → timeout | Jaeger, Zipkin, Dynatrace

Metrics That Matter

Table 7.10: Manufacturing IT KPIs

KPI | Description | Target | How to Measure
Integration Uptime | % time integration endpoints available | >99.5% | Monitor API health checks
Data Latency | Time from edge event to cloud analytics | <5 min (streaming), <1 hr (batch) | Timestamp comparison (event time vs. arrival time)
Error Rate | % of integration transactions that fail | <0.5% | Log analysis (count errors / total transactions)
Mean Time to Detect (MTTD) | Time from incident start to alert | <5 min | Alert timestamp - incident start timestamp
Mean Time to Restore (MTTR) | Time from incident alert to resolution | <1 hr (critical), <4 hr (high) | Resolution timestamp - alert timestamp
Change Failure Rate | % of deployments causing incidents | <5% | Incidents caused by deployment / total deployments
Deployment Frequency | How often code is deployed to production | Weekly (mature DevOps orgs) | Count deployments per week
Lead Time for Changes | Code commit to production deployment | <1 day (mature DevOps) | Deployment timestamp - commit timestamp
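Several of these KPIs reduce to simple timestamp arithmetic. The sketch below computes MTTD, MTTR, change failure rate, and error rate from a handful of records; the incident data, field names, and counts are made up purely for illustration.

```python
from datetime import datetime, timedelta


def mean_duration(durations):
    """Average a list of timedelta values."""
    return sum(durations, timedelta()) / len(durations)


t0 = datetime(2024, 1, 1, 8, 0)
# Hypothetical incident records: when each started, was alerted on, and was resolved.
incidents = [
    {"start": t0, "alert": t0 + timedelta(minutes=4),
     "resolved": t0 + timedelta(minutes=44), "caused_by_deploy": True},
    {"start": t0 + timedelta(days=1), "alert": t0 + timedelta(days=1, minutes=2),
     "resolved": t0 + timedelta(days=1, minutes=32), "caused_by_deploy": False},
]

mttd = mean_duration([i["alert"] - i["start"] for i in incidents])
mttr = mean_duration([i["resolved"] - i["alert"] for i in incidents])

deployments = 40  # hypothetical count for the same period
change_failure_rate = sum(i["caused_by_deploy"] for i in incidents) / deployments

failed_tx, total_tx = 120, 50_000  # hypothetical integration transaction counts
error_rate = failed_tx / total_tx

print(f"MTTD={mttd}  MTTR={mttr}")
print(f"Change failure rate={change_failure_rate:.1%}  Error rate={error_rate:.2%}")
```

With these sample numbers every KPI lands inside its target from Table 7.10, which makes the same script a convenient baseline check during the assessment phase.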

SLAs and SLOs

SLA (Service Level Agreement): Contract with business. "MES will be available 99.5% of production hours."

SLO (Service Level Objective): Internal target (stricter than the SLA to provide a buffer). "We target 99.9% uptime."

Example:

SLA: ERP-MES integration will process work order confirmations within 1 minute, 99% of the time.

SLO: Internal target: 30 seconds, 99.5% of the time.

Monitoring: If SLO breached (but SLA not yet violated), proactive investigation. If SLA breached, escalate to management.
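The monitoring rule above can be expressed as a small check over observed confirmation latencies. The thresholds come from the example SLA and SLO; the function names and the sample latency lists are illustrative assumptions.

```python
SLA = {"limit_s": 60.0, "target": 0.99}    # within 1 minute, 99% of the time
SLO = {"limit_s": 30.0, "target": 0.995}   # within 30 seconds, 99.5% of the time


def fraction_within(latencies_s, limit_s):
    """Share of transactions completed within the given time limit."""
    return sum(1 for x in latencies_s if x <= limit_s) / len(latencies_s)


def evaluate(latencies_s):
    """Check the SLA first, then the tighter internal SLO."""
    if fraction_within(latencies_s, SLA["limit_s"]) < SLA["target"]:
        return "SLA breached: escalate to management"
    if fraction_within(latencies_s, SLO["limit_s"]) < SLO["target"]:
        return "SLO breached: investigate proactively"
    return "OK"


healthy = [5.0] * 1000
slowing = [5.0] * 990 + [45.0] * 10   # 1% take 45 s: inside SLA, outside SLO
failing = [5.0] * 980 + [90.0] * 20   # 2% take 90 s: outside SLA

for name, sample in (("healthy", healthy), ("slowing", slowing), ("failing", failing)):
    print(name, "->", evaluate(sample))
```

The "slowing" case is the one the buffer exists for: the contract is still being met, but the trend is visible early enough to act before the business notices.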


Governance: Making IT and OT Work Together

The RACI Model

RACI: Responsible, Accountable, Consulted, Informed

Table 7.11: IT/OT Governance RACI (Example)

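Activity | IT | OT | Engineering | Operations | Vendor
Define MES Requirements | C | R | C | A | I
Select MES Vendor | C | R | C | A | -
Configure MES | R | C | C | A | C
Integrate MES ↔ ERP | R | I | I | A | C
Deploy MES to Production | R | C | C | A | C
Operate MES (Day-to-Day) | C | R | I | A | I
Patch MES Server | A/R | I | I | C | I
Change PLC Program | I | R | C | A | C
Incident Response (IT Issue) | R | C | I | A | C
Incident Response (OT Issue) | C | R | C | A | C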

Key:

  • R (Responsible): Does the work
  • A (Accountable): Approves/owns outcome (one A per row)
  • C (Consulted): Provides input
  • I (Informed): Kept in loop
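A RACI matrix is easy to validate mechanically against the "one A per row" rule. The sketch below is a minimal checker; the `raci_errors` function is a hypothetical helper, the first row is taken from the table above, and the second row is a deliberately broken example rather than part of the recommended matrix.

```python
def raci_errors(matrix):
    """Return messages for rows that lack exactly one A or have no R."""
    errors = []
    for activity, roles in matrix.items():
        # A cell such as "A/R" assigns two codes to the same team.
        codes = [code for cell in roles.values() for code in cell.split("/")]
        if codes.count("A") != 1:
            errors.append(f"{activity}: expected exactly one A, found {codes.count('A')}")
        if "R" not in codes:
            errors.append(f"{activity}: no R assigned")
    return errors


matrix = {
    "Define MES Requirements": {
        "IT": "C", "OT": "R", "Engineering": "C", "Operations": "A", "Vendor": "I",
    },
    # Deliberately broken illustrative row: nobody is accountable.
    "Hypothetical activity": {
        "IT": "R", "OT": "I", "Engineering": "I", "Operations": "C", "Vendor": "-",
    },
}

for problem in raci_errors(matrix):
    print(problem)
```

Running a check like this whenever the matrix changes catches the most common governance gap, an activity that several teams touch and nobody owns.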

Change Management Process

Problem: Unauthorized changes to PLCs, MES, or integrations cause outages.

Solution: Formal change control.

Change Tiers:

Tier | Description | Approval Required | Lead Time | Example
Emergency | Production down; immediate fix needed | Verbal (CIO or VP Ops) | 0 (act now) | PLC fix to restore line
Standard | Pre-approved, low-risk | Automated (CAB pre-approved) | 1 day | Apply Windows patch (tested)
Normal | Moderate risk, tested in dev/staging | Change Advisory Board (CAB) | 1 week | MES feature update
Major | High risk, complex, multi-system | Executive approval (CIO + VP Ops) | 2-4 weeks | ERP upgrade

CAB (Change Advisory Board): Weekly meeting. IT, OT, Engineering, Operations review proposed changes. Approve/defer/reject.
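The change tiers lend themselves to a simple routing table in a ticketing or workflow tool. The sketch below encodes the approvers and lead times from the table; the three yes/no questions in `classify` are a simplifying assumption for illustration, since real tiering usually involves a fuller risk assessment.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass(frozen=True)
class TierPolicy:
    approver: str
    lead_time: timedelta


POLICIES = {
    "emergency": TierPolicy("Verbal (CIO or VP Ops)", timedelta(0)),
    "standard": TierPolicy("Automated (CAB pre-approved)", timedelta(days=1)),
    "normal": TierPolicy("Change Advisory Board (CAB)", timedelta(weeks=1)),
    # Lower bound of the 2-4 week range in the table.
    "major": TierPolicy("Executive approval (CIO + VP Ops)", timedelta(weeks=2)),
}


def classify(production_down: bool, pre_approved: bool, multi_system: bool) -> str:
    """Simplified tiering rule (an assumption, not a complete risk assessment)."""
    if production_down:
        return "emergency"
    if pre_approved:
        return "standard"
    if multi_system:
        return "major"
    return "normal"


tier = classify(production_down=False, pre_approved=False, multi_system=True)
policy = POLICIES[tier]
print(tier, "->", policy.approver, "| minimum lead time:", policy.lead_time.days, "days")
```

Encoding the policy as data keeps the approval rules in one place, so the CAB can adjust lead times without anyone rewriting workflow logic.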


Implementation Roadmap

Phase 1: Assess (Months 1-3)

  • Inventory all IT and OT systems (asset register)
  • Map current integrations (document data flows)
  • Assess cybersecurity posture (NIST CSF maturity)
  • Identify technical debt (systems EOL, unsupported versions)
  • Baseline KPIs (integration uptime, error rates, MTTR)

Phase 2: Standardize (Months 3-9)

  • Define IT/OT governance (RACI, change control process)
  • Select standard protocols (OPC UA for PLCs, MQTT for IoT, REST for APIs)
  • Implement network segmentation (IT/OT DMZ with firewalls)
  • Deploy SIEM for unified logging (IT + OT events)
  • Establish data governance (ownership, retention, quality standards)

Phase 3: Converge (Months 9-18)

  • Deploy edge gateways for OT data collection
  • Build data platform (lakehouse with raw/enriched/curated layers)
  • Integrate core systems (ERP ↔ MES, MES ↔ QMS, SCADA → Historian)
  • Implement MFA and least-privilege access
  • Launch CI/CD pipeline for MES/analytics deployments

Phase 4: Optimize (Months 18-24+)

  • Deploy AI/ML models (predictive maintenance, quality prediction)
  • Enable closed-loop control (MES adjusts PLC based on analytics)
  • Scale across all plants (standardized architecture)
  • Continuous improvement (monthly retrospectives, quarterly architecture reviews)

Common Pitfalls and Mitigations

Table 7.12: IT Implementation Pitfalls

Pitfall | Example | Impact | Mitigation
One-Off Integrations | Custom code for every Plant A → Plant B data flow | Spaghetti architecture, unmaintainable | Use reusable patterns (API gateway, event bus)
Shadow IT Platforms | Plant builds own data lake; corporate has another | Data silos, duplicate spend | Centralized governance with chargeback model
Latency Surprises | Assume cloud works for real-time; it doesn't | Closed-loop control fails | Pilot edge vs. cloud; test latency before commit
Credential Sprawl | 50+ service accounts, shared passwords | Security risk, audit nightmare | SSO, credential vaulting (CyberArk, HashiCorp Vault)
Unmanaged Vendor Access | Vendor has VPN with admin rights, no monitoring | Insider threat, compliance violation | Vendor access broker (jump host, MFA, session recording)
Ignoring OT Culture | IT deploys MES without consulting operators | Resistance, workarounds | Joint workshops, operator champions, involve OT early
Big-Bang Deployment | Replace all systems at once | Catastrophic failure, no rollback | Incremental (pilot → scale), always have rollback plan

Conclusion: IT as the Manufacturing Nervous System

Manufacturing plants are no longer isolated factories. They're nodes in a global, data-driven network. IT is the nervous system that connects machines to decisions, operators to insights, plants to headquarters, suppliers to demand.

Your role as IT in manufacturing:

  • Enable, don't constrain: Provide tools and platforms that empower operations, engineering, and quality to move faster.
  • Secure, don't lock down: Protect OT from threats, but don't make legitimate access so hard that users bypass you.
  • Standardize, don't stifle: Create reusable patterns and architectures, but allow local flexibility within guardrails.
  • Measure, don't assume: Instrument everything. Data-driven decisions beat opinions.

The manufacturers who win treat IT as a strategic partner, not a cost center. They invest in converged IT/OT architectures, data platforms, and DevOps capabilities. The result: faster innovation, lower risk, and sustainable competitive advantage.


Chapter Summary

Topic | Key Takeaway
IT/OT Convergence | No longer optional; required for Industry 4.0. Build secure bridges (DMZ, firewalls, monitoring).
Systems Landscape | 10+ core systems (ERP, MES, PLM, QMS, SCADA, etc.) must integrate seamlessly.
Integration Patterns | Hybrid approach: API Gateway (transactions), Event Bus (real-time), Data Lake (analytics).
Data Platform | Bronze (raw) → Silver (enriched) → Gold (curated) lakehouse architecture.
Edge vs. Cloud | Hybrid: Edge for real-time; Cloud for scale, analytics, AI training.
Security | Defense-in-depth (network, identity, device, app, data layers); NIST CSF minimum Tier 3.
DevOps | CI/CD pipelines enable weekly releases vs. 6-month waterfall. Infrastructure as Code standardizes deployments.
Observability | Logs + Metrics + Traces. Monitor SLAs/SLOs. MTTD <5 min, MTTR <1 hr for critical.
Governance | IT/OT joint RACI. Change control via CAB. Monthly reviews.

Discussion Questions

  1. IT/OT Tensions: How do you resolve conflicts when IT wants to patch servers but OT says "Don't touch anything during production season"?

  2. Edge Economics: At what data volume does it become cheaper to process at the edge vs. cloud? (Hint: Calculate bandwidth costs.)

  3. Shadow IT: Plant manager deploys unauthorized cloud analytics tool. Do you shut it down or embrace it? How do you prevent future shadow IT?

  4. DevOps Readiness: Your organization has no automated testing. How do you build CI/CD capability without disrupting current operations?

  5. Vendor Lock-In: You're on legacy Historian X (end-of-life in 2 years). Switching costs $2M. Wait for EOL or migrate now?



Next Chapter Preview:

You now understand the IT systems landscape and how they integrate. Chapter 8 shifts to the business perspective: Manufacturing IT Services Portfolio. What services should you offer? How do you package them? How do you price them? This is your go-to-market playbook for selling IT services to manufacturers.