A few months ago, our team had to untangle a very similar situation. We inherited an environment where some IAM roles were basically wide open because people were afraid to restrict anything. The first thing we did was spend a full week mapping out which services each workload actually touches. You'd be surprised how much unused permission surface shows up once you observe actual behavior instead of assuming what the app "might" need. We then built reduced policies based on those findings and tested them in stages. Another thing that helped was setting up CloudTrail with alerting on unusual API calls; it doesn't have to be perfect on day one, but it gives you visibility you can't do without. For a structured approach, we looked at this resource (AWS consulting), because it explains how they design "security-first" environments, and the idea of least privilege combined with proactive monitoring actually clicked for us after that. It's a slow process, but if you roll out the changes role by role and communicate early with the dev teams, it becomes manageable instead of intimidating.
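To make the "observe first, then restrict" idea concrete, here is a rough Python sketch of how observed API calls could be turned into a draft policy and a simple "unusual call" check. It is not what we actually ran, just an illustration. The event records are made up and flattened: the `roleArn` field is a hypothetical simplification (in real CloudTrail records the role is nested under `userIdentity`), and the function names are mine. It also assumes the first label of `eventSource` matches the IAM action prefix and that the event name matches the IAM action, which holds for many services but not all, so treat the output as a starting point to review by hand.

```python
import json
from collections import defaultdict

# Hypothetical role ARN and flattened sample records, for illustration only.
ROLE = "arn:aws:iam::123456789012:role/orders-api"
SAMPLE_EVENTS = [
    {"roleArn": ROLE, "eventSource": "s3.amazonaws.com", "eventName": "GetObject"},
    {"roleArn": ROLE, "eventSource": "dynamodb.amazonaws.com", "eventName": "PutItem"},
    {"roleArn": ROLE, "eventSource": "s3.amazonaws.com", "eventName": "GetObject"},
]


def observed_actions(events):
    """Group observed API calls by role as a set of 'service:Action' strings."""
    by_role = defaultdict(set)
    for event in events:
        # Assumption: the first label of eventSource is the IAM service prefix.
        service = event["eventSource"].split(".")[0]
        by_role[event["roleArn"]].add(f"{service}:{event['eventName']}")
    return dict(by_role)


def draft_policy(actions):
    """Build a draft IAM policy document that allows only the observed actions."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": sorted(actions),
                # Placeholder: narrow this to specific resource ARNs before use.
                "Resource": "*",
            }
        ],
    }


def unusual_calls(baseline, new_events):
    """Return (role, action) pairs that were never seen during the baseline week."""
    flagged = []
    for role, actions in observed_actions(new_events).items():
        for action in sorted(actions - baseline.get(role, set())):
            flagged.append((role, action))
    return flagged


if __name__ == "__main__":
    baseline = observed_actions(SAMPLE_EVENTS)
    print(json.dumps(draft_policy(baseline[ROLE]), indent=2))
```

The same baseline does double duty: it feeds the reduced policy, and anything outside it is a candidate for an alert. Rolling this out one role at a time keeps the review of each draft policy small enough to do with the owning dev team.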
Posted by Waivio guest: @waivio_valensia-romand